• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

tarantool / luajit / 24031030227

06 Apr 2026 11:55AM UTC coverage: 93.024% (-0.04%) from 93.062%
24031030227

push

github

Buristan
cmake: fixup tests build for old CMake

This patch is a follow-up to the commit
1e5887d88 ("FFI: Avoid dangling
cts->L."). The `target_link_options()` feature is introduced in the
CMake 3.13. Unfortunately, we have distros in Tarantool's CI with an
older default CMake version.

This patch workarounds this by the passing the LINK_FLAGS directly.

5716 of 6056 branches covered (94.39%)

Branch coverage included in aggregate %.

21822 of 23547 relevant lines covered (92.67%)

3973282.71 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

76.88
/src/lj_opt_fold.c
1
/*
2
** FOLD: Constant Folding, Algebraic Simplifications and Reassociation.
3
** ABCelim: Array Bounds Check Elimination.
4
** CSE: Common-Subexpression Elimination.
5
** Copyright (C) 2005-2017 Mike Pall. See Copyright Notice in luajit.h
6
*/
7

8
#define lj_opt_fold_c
9
#define LUA_CORE
10

11
#include <math.h>
12

13
#include "lj_obj.h"
14

15
#if LJ_HASJIT
16

17
#include "lj_buf.h"
18
#include "lj_str.h"
19
#include "lj_tab.h"
20
#include "lj_ir.h"
21
#include "lj_jit.h"
22
#include "lj_ircall.h"
23
#include "lj_iropt.h"
24
#include "lj_trace.h"
25
#if LJ_HASFFI
26
#include "lj_ctype.h"
27
#include "lj_carith.h"
28
#endif
29
#include "lj_vm.h"
30
#include "lj_strscan.h"
31
#include "lj_strfmt.h"
32

33
/* Here's a short description how the FOLD engine processes instructions:
34
**
35
** The FOLD engine receives a single instruction stored in fins (J->fold.ins).
36
** The instruction and its operands are used to select matching fold rules.
37
** These are applied iteratively until a fixed point is reached.
38
**
39
** The 8 bit opcode of the instruction itself plus the opcodes of the
40
** two instructions referenced by its operands form a 24 bit key
41
** 'ins left right' (unused operands -> 0, literals -> lowest 8 bits).
42
**
43
** This key is used for partial matching against the fold rules. The
44
** left/right operand fields of the key are successively masked with
45
** the 'any' wildcard, from most specific to least specific:
46
**
47
**   ins left right
48
**   ins any  right
49
**   ins left any
50
**   ins any  any
51
**
52
** The masked key is used to lookup a matching fold rule in a semi-perfect
53
** hash table. If a matching rule is found, the related fold function is run.
54
** Multiple rules can share the same fold function. A fold rule may return
55
** one of several special values:
56
**
57
** - NEXTFOLD means no folding was applied, because an additional test
58
**   inside the fold function failed. Matching continues against less
59
**   specific fold rules. Finally the instruction is passed on to CSE.
60
**
61
** - RETRYFOLD means the instruction was modified in-place. Folding is
62
**   retried as if this instruction had just been received.
63
**
64
** All other return values are terminal actions -- no further folding is
65
** applied:
66
**
67
** - INTFOLD(i) returns a reference to the integer constant i.
68
**
69
** - LEFTFOLD and RIGHTFOLD return the left/right operand reference
70
**   without emitting an instruction.
71
**
72
** - CSEFOLD and EMITFOLD pass the instruction directly to CSE or emit
73
**   it without passing through any further optimizations.
74
**
75
** - FAILFOLD, DROPFOLD and CONDFOLD only apply to instructions which have
76
**   no result (e.g. guarded assertions): FAILFOLD means the guard would
77
**   always fail, i.e. the current trace is pointless. DROPFOLD means
78
**   the guard is always true and has been eliminated. CONDFOLD is a
79
**   shortcut for FAILFOLD + cond (i.e. drop if true, otherwise fail).
80
**
81
** - Any other return value is interpreted as an IRRef or TRef. This
82
**   can be a reference to an existing or a newly created instruction.
83
**   Only the least-significant 16 bits (IRRef1) are used to form a TRef
84
**   which is finally returned to the caller.
85
**
86
** The FOLD engine receives instructions both from the trace recorder and
87
** substituted instructions from LOOP unrolling. This means all types
88
** of instructions may end up here, even though the recorder bypasses
89
** FOLD in some cases. Thus all loads, stores and allocations must have
90
** an any/any rule to avoid being passed on to CSE.
91
**
92
** Carefully read the following requirements before adding or modifying
93
** any fold rules:
94
**
95
** Requirement #1: All fold rules must preserve their destination type.
96
**
97
** Consistently use INTFOLD() (KINT result) or lj_ir_knum() (KNUM result).
98
** Never use lj_ir_knumint() which can have either a KINT or KNUM result.
99
**
100
** Requirement #2: Fold rules should not create *new* instructions which
101
** reference operands *across* PHIs.
102
**
103
** E.g. a RETRYFOLD with 'fins->op1 = fleft->op1' is invalid if the
104
** left operand is a PHI. Then fleft->op1 would point across the PHI
105
** frontier to an invariant instruction. Adding a PHI for this instruction
106
** would be counterproductive. The solution is to add a barrier which
107
** prevents folding across PHIs, i.e. 'PHIBARRIER(fleft)' in this case.
108
** The only exception is for recurrences with high latencies like
109
** repeated int->num->int conversions.
110
**
111
** One could relax this condition a bit if the referenced instruction is
112
** a PHI, too. But this often leads to worse code due to excessive
113
** register shuffling.
114
**
115
** Note: returning *existing* instructions (e.g. LEFTFOLD) is ok, though.
116
** Even returning fleft->op1 would be ok, because a new PHI will added,
117
** if needed. But again, this leads to excessive register shuffling and
118
** should be avoided.
119
**
120
** Requirement #3: The set of all fold rules must be monotonic to guarantee
121
** termination.
122
**
123
** The goal is optimization, so one primarily wants to add strength-reducing
124
** rules. This means eliminating an instruction or replacing an instruction
125
** with one or more simpler instructions. Don't add fold rules which point
126
** into the other direction.
127
**
128
** Some rules (like commutativity) do not directly reduce the strength of
129
** an instruction, but enable other fold rules (e.g. by moving constants
130
** to the right operand). These rules must be made unidirectional to avoid
131
** cycles.
132
**
133
** Rule of thumb: the trace recorder expands the IR and FOLD shrinks it.
134
*/
135

136
/* Some local macros to save typing. Undef'd at the end. */
137
#define IR(ref)                (&J->cur.ir[(ref)])
138
#define fins                (&J->fold.ins)
139
#define fleft                (J->fold.left)
140
#define fright                (J->fold.right)
141
#define knumleft        (ir_knum(fleft)->n)
142
#define knumright        (ir_knum(fright)->n)
143

144
/* Pass IR on to next optimization in chain (FOLD). */
145
#define emitir(ot, a, b)        (lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
146

147
/* Fold function type. Fastcall on x86 significantly reduces their size. */
148
typedef IRRef (LJ_FASTCALL *FoldFunc)(jit_State *J);
149

150
/* Macros for the fold specs, so buildvm can recognize them. */
151
#define LJFOLD(x)
152
#define LJFOLDX(x)
153
#define LJFOLDF(name)        static TRef LJ_FASTCALL fold_##name(jit_State *J)
154
/* Note: They must be at the start of a line or buildvm ignores them! */
155

156
/* Barrier to prevent using operands across PHIs. */
157
#define PHIBARRIER(ir)        if (irt_isphi((ir)->t)) return NEXTFOLD
158

159
/* Barrier to prevent folding across a GC step.
160
** GC steps can only happen at the head of a trace and at LOOP.
161
** And the GC is only driven forward if there's at least one allocation.
162
*/
163
#define gcstep_barrier(J, ref) \
164
  ((ref) < J->chain[IR_LOOP] && \
165
   (J->chain[IR_SNEW] || J->chain[IR_XSNEW] || \
166
    J->chain[IR_TNEW] || J->chain[IR_TDUP] || \
167
    J->chain[IR_CNEW] || J->chain[IR_CNEWI] || \
168
    J->chain[IR_BUFSTR] || J->chain[IR_TOSTR] || J->chain[IR_CALLA]))
169

170
/* -- Constant folding for FP numbers ------------------------------------- */
171

172
LJFOLD(ADD KNUM KNUM)
173
LJFOLD(SUB KNUM KNUM)
174
LJFOLD(MUL KNUM KNUM)
175
LJFOLD(DIV KNUM KNUM)
176
LJFOLD(LDEXP KNUM KNUM)
177
LJFOLD(MIN KNUM KNUM)
178
LJFOLD(MAX KNUM KNUM)
179
LJFOLDF(kfold_numarith)
1,520✔
180
{
181
  lua_Number a = knumleft;
1,520✔
182
  lua_Number b = knumright;
1,520✔
183
  lua_Number y = lj_vm_foldarith(a, b, fins->o - IR_ADD);
1,520✔
184
  return lj_ir_knum(J, y);
1,520✔
185
}
186

187
LJFOLD(NEG KNUM FLOAD)
188
LJFOLD(ABS KNUM FLOAD)
189
LJFOLDF(kfold_numabsneg)
21✔
190
{
191
  lua_Number a = knumleft;
21✔
192
  lua_Number y = lj_vm_foldarith(a, a, fins->o - IR_ADD);
21✔
193
  return lj_ir_knum(J, y);
21✔
194
}
195

196
LJFOLD(LDEXP KNUM KINT)
197
LJFOLDF(kfold_ldexp)
×
198
{
199
#if LJ_TARGET_X86ORX64
200
  UNUSED(J);
×
201
  return NEXTFOLD;
×
202
#else
203
  return lj_ir_knum(J, ldexp(knumleft, fright->i));
204
#endif
205
}
206

207
LJFOLD(FPMATH KNUM any)
208
LJFOLDF(kfold_fpmath)
15✔
209
{
210
  lua_Number a = knumleft;
15✔
211
  lua_Number y = lj_vm_foldfpm(a, fins->op2);
15✔
212
  return lj_ir_knum(J, y);
15✔
213
}
214

215
LJFOLD(CALLN KNUM any)
216
LJFOLDF(kfold_fpcall1)
5✔
217
{
218
  const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
5✔
219
  if (CCI_TYPE(ci) == IRT_NUM) {
5✔
220
    double y = ((double (*)(double))ci->func)(knumleft);
5✔
221
    return lj_ir_knum(J, y);
5✔
222
  }
223
  return NEXTFOLD;
224
}
225

226
LJFOLD(CALLN CARG IRCALL_atan2)
227
LJFOLDF(kfold_fpcall2)
1✔
228
{
229
  if (irref_isk(fleft->op1) && irref_isk(fleft->op2)) {
1✔
230
    const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
1✔
231
    double a = ir_knum(IR(fleft->op1))->n;
1✔
232
    double b = ir_knum(IR(fleft->op2))->n;
1✔
233
    double y = ((double (*)(double, double))ci->func)(a, b);
1✔
234
    return lj_ir_knum(J, y);
1✔
235
  }
236
  return NEXTFOLD;
237
}
238

239
LJFOLD(POW KNUM KNUM)
240
LJFOLDF(kfold_numpow)
10,269✔
241
{
242
  return lj_ir_knum(J, lj_vm_foldarith(knumleft, knumright, IR_POW - IR_ADD));
10,269✔
243
}
244

245
/* Must not use kfold_kref for numbers (could be NaN). */
246
LJFOLD(EQ KNUM KNUM)
247
LJFOLD(NE KNUM KNUM)
248
LJFOLD(LT KNUM KNUM)
249
LJFOLD(GE KNUM KNUM)
250
LJFOLD(LE KNUM KNUM)
251
LJFOLD(GT KNUM KNUM)
252
LJFOLD(ULT KNUM KNUM)
253
LJFOLD(UGE KNUM KNUM)
254
LJFOLD(ULE KNUM KNUM)
255
LJFOLD(UGT KNUM KNUM)
256
LJFOLDF(kfold_numcomp)
6✔
257
{
258
  return CONDFOLD(lj_ir_numcmp(knumleft, knumright, (IROp)fins->o));
6✔
259
}
260

261
/* -- Constant folding for 32 bit integers -------------------------------- */
262

263
#if LUAJIT_USE_UBSAN
264
/* Cdata arithmetic depends on the interger overflow. */
265
static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
266
  __attribute__((no_sanitize("signed-integer-overflow")));
267
#endif
268
static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
6,314✔
269
{
270
  switch (op) {
6,314✔
271
  case IR_ADD: k1 += k2; break;
4,474✔
272
  case IR_SUB: k1 -= k2; break;
3✔
273
  case IR_MUL: k1 *= k2; break;
×
274
  case IR_MOD: k1 = lj_vm_modi(k1, k2); break;
19✔
275
  case IR_NEG: k1 = (int32_t)(~(uint32_t)k1+1u); break;
×
276
  case IR_BAND: k1 &= k2; break;
1✔
277
  case IR_BOR: k1 |= k2; break;
×
278
  case IR_BXOR: k1 ^= k2; break;
×
279
  case IR_BSHL: k1 <<= (k2 & 31); break;
39✔
280
  case IR_BSHR: k1 = (int32_t)((uint32_t)k1 >> (k2 & 31)); break;
11✔
281
  case IR_BSAR: k1 >>= (k2 & 31); break;
×
282
  case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 31)); break;
1✔
283
  case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 31)); break;
×
284
  case IR_MIN: k1 = k1 < k2 ? k1 : k2; break;
2✔
285
  case IR_MAX: k1 = k1 > k2 ? k1 : k2; break;
1,764✔
286
  default: lj_assertX(0, "bad IR op %d", op); break;
287
  }
288
  return k1;
6,314✔
289
}
290

291
LJFOLD(ADD KINT KINT)
292
LJFOLD(SUB KINT KINT)
293
LJFOLD(MUL KINT KINT)
294
LJFOLD(MOD KINT KINT)
295
LJFOLD(NEG KINT KINT)
296
LJFOLD(BAND KINT KINT)
297
LJFOLD(BOR KINT KINT)
298
LJFOLD(BXOR KINT KINT)
299
LJFOLD(BSHL KINT KINT)
300
LJFOLD(BSHR KINT KINT)
301
LJFOLD(BSAR KINT KINT)
302
LJFOLD(BROL KINT KINT)
303
LJFOLD(BROR KINT KINT)
304
LJFOLD(MIN KINT KINT)
305
LJFOLD(MAX KINT KINT)
306
LJFOLDF(kfold_intarith)
2,585✔
307
{
308
  return INTFOLD(kfold_intop(fleft->i, fright->i, (IROp)fins->o));
2,585✔
309
}
310

311
LJFOLD(ADDOV KINT KINT)
312
LJFOLD(SUBOV KINT KINT)
313
LJFOLD(MULOV KINT KINT)
314
LJFOLDF(kfold_intovarith)
12,587✔
315
{
316
  lua_Number n = lj_vm_foldarith((lua_Number)fleft->i, (lua_Number)fright->i,
25,174✔
317
                                 fins->o - IR_ADDOV);
12,587✔
318
  int32_t k = lj_num2int(n);
12,587✔
319
  if (n != (lua_Number)k)
12,587✔
320
    return FAILFOLD;
321
  return INTFOLD(k);
12,587✔
322
}
323

324
LJFOLD(BNOT KINT)
325
LJFOLDF(kfold_bnot)
×
326
{
327
  return INTFOLD(~fleft->i);
×
328
}
329

330
LJFOLD(BSWAP KINT)
331
LJFOLDF(kfold_bswap)
×
332
{
333
  return INTFOLD((int32_t)lj_bswap((uint32_t)fleft->i));
×
334
}
335

336
LJFOLD(LT KINT KINT)
337
LJFOLD(GE KINT KINT)
338
LJFOLD(LE KINT KINT)
339
LJFOLD(GT KINT KINT)
340
LJFOLD(ULT KINT KINT)
341
LJFOLD(UGE KINT KINT)
342
LJFOLD(ULE KINT KINT)
343
LJFOLD(UGT KINT KINT)
344
LJFOLD(ABC KINT KINT)
345
LJFOLDF(kfold_intcomp)
1,413✔
346
{
347
  int32_t a = fleft->i, b = fright->i;
1,413✔
348
  switch ((IROp)fins->o) {
1,413✔
349
  case IR_LT: return CONDFOLD(a < b);
63✔
350
  case IR_GE: return CONDFOLD(a >= b);
130✔
351
  case IR_LE: return CONDFOLD(a <= b);
730✔
352
  case IR_GT: return CONDFOLD(a > b);
116✔
353
  case IR_ULT: return CONDFOLD((uint32_t)a < (uint32_t)b);
×
354
  case IR_UGE: return CONDFOLD((uint32_t)a >= (uint32_t)b);
×
355
  case IR_ULE: return CONDFOLD((uint32_t)a <= (uint32_t)b);
42✔
356
  case IR_ABC:
332✔
357
  case IR_UGT: return CONDFOLD((uint32_t)a > (uint32_t)b);
332✔
358
  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
359
  }
360
}
361

362
LJFOLD(UGE any KINT)
363
LJFOLDF(kfold_intcomp0)
61✔
364
{
365
  if (fright->i == 0)
61✔
366
    return DROPFOLD;
37✔
367
  return NEXTFOLD;
368
}
369

370
/* -- Constant folding for 64 bit integers -------------------------------- */
371

372
static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
373
                                 IROp op)
374
{
375
  UNUSED(J);
376
#if LJ_HASFFI
377
  switch (op) {
378
  case IR_ADD: k1 += k2; break;
379
  case IR_SUB: k1 -= k2; break;
380
  case IR_MUL: k1 *= k2; break;
381
  case IR_BAND: k1 &= k2; break;
382
  case IR_BOR: k1 |= k2; break;
383
  case IR_BXOR: k1 ^= k2; break;
384
  case IR_BSHL: k1 <<= (k2 & 63); break;
385
  case IR_BSHR: k1 >>= (k2 & 63); break;
386
  case IR_BSAR: k1 = (uint64_t)((int64_t)k1 >> (k2 & 63)); break;
387
  case IR_BROL: k1 = lj_rol(k1, (k2 & 63)); break;
388
  case IR_BROR: k1 = lj_ror(k1, (k2 & 63)); break;
389
  default: lj_assertJ(0, "bad IR op %d", op); break;
390
  }
391
#else
392
  UNUSED(k2); UNUSED(op);
393
  lj_assertJ(0, "FFI IR op without FFI");
394
#endif
395
  return k1;
396
}
397

398
LJFOLD(ADD KINT64 KINT64)
399
LJFOLD(SUB KINT64 KINT64)
400
LJFOLD(MUL KINT64 KINT64)
401
LJFOLD(BAND KINT64 KINT64)
402
LJFOLD(BOR KINT64 KINT64)
403
LJFOLD(BXOR KINT64 KINT64)
404
LJFOLDF(kfold_int64arith)
169✔
405
{
406
  return INT64FOLD(kfold_int64arith(J, ir_k64(fleft)->u64,
169✔
407
                                    ir_k64(fright)->u64, (IROp)fins->o));
408
}
409

410
LJFOLD(DIV KINT64 KINT64)
411
LJFOLD(MOD KINT64 KINT64)
412
LJFOLD(POW KINT64 KINT64)
413
LJFOLDF(kfold_int64arith2)
×
414
{
415
#if LJ_HASFFI
416
  uint64_t k1 = ir_k64(fleft)->u64, k2 = ir_k64(fright)->u64;
×
417
  if (irt_isi64(fins->t)) {
×
418
    k1 = fins->o == IR_DIV ? lj_carith_divi64((int64_t)k1, (int64_t)k2) :
×
419
         fins->o == IR_MOD ? lj_carith_modi64((int64_t)k1, (int64_t)k2) :
×
420
                             lj_carith_powi64((int64_t)k1, (int64_t)k2);
×
421
  } else {
422
    k1 = fins->o == IR_DIV ? lj_carith_divu64(k1, k2) :
×
423
         fins->o == IR_MOD ? lj_carith_modu64(k1, k2) :
×
424
                             lj_carith_powu64(k1, k2);
×
425
  }
426
  return INT64FOLD(k1);
×
427
#else
428
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
429
#endif
430
}
431

432
LJFOLD(BSHL KINT64 KINT)
433
LJFOLD(BSHR KINT64 KINT)
434
LJFOLD(BSAR KINT64 KINT)
435
LJFOLD(BROL KINT64 KINT)
436
LJFOLD(BROR KINT64 KINT)
437
LJFOLDF(kfold_int64shift)
×
438
{
439
#if LJ_HASFFI
440
  uint64_t k = ir_k64(fleft)->u64;
×
441
  int32_t sh = (fright->i & 63);
×
442
  return INT64FOLD(lj_carith_shift64(k, sh, fins->o - IR_BSHL));
×
443
#else
444
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
445
#endif
446
}
447

448
LJFOLD(BNOT KINT64)
449
LJFOLDF(kfold_bnot64)
×
450
{
451
#if LJ_HASFFI
452
  return INT64FOLD(~ir_k64(fleft)->u64);
×
453
#else
454
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
455
#endif
456
}
457

458
LJFOLD(BSWAP KINT64)
459
LJFOLDF(kfold_bswap64)
×
460
{
461
#if LJ_HASFFI
462
  return INT64FOLD(lj_bswap64(ir_k64(fleft)->u64));
×
463
#else
464
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
465
#endif
466
}
467

468
LJFOLD(LT KINT64 KINT64)
469
LJFOLD(GE KINT64 KINT64)
470
LJFOLD(LE KINT64 KINT64)
471
LJFOLD(GT KINT64 KINT64)
472
LJFOLD(ULT KINT64 KINT64)
473
LJFOLD(UGE KINT64 KINT64)
474
LJFOLD(ULE KINT64 KINT64)
475
LJFOLD(UGT KINT64 KINT64)
476
LJFOLDF(kfold_int64comp)
×
477
{
478
#if LJ_HASFFI
479
  uint64_t a = ir_k64(fleft)->u64, b = ir_k64(fright)->u64;
×
480
  switch ((IROp)fins->o) {
×
481
  case IR_LT: return CONDFOLD((int64_t)a < (int64_t)b);
×
482
  case IR_GE: return CONDFOLD((int64_t)a >= (int64_t)b);
×
483
  case IR_LE: return CONDFOLD((int64_t)a <= (int64_t)b);
×
484
  case IR_GT: return CONDFOLD((int64_t)a > (int64_t)b);
×
485
  case IR_ULT: return CONDFOLD(a < b);
×
486
  case IR_UGE: return CONDFOLD(a >= b);
×
487
  case IR_ULE: return CONDFOLD(a <= b);
×
488
  case IR_UGT: return CONDFOLD(a > b);
×
489
  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
490
  }
491
#else
492
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
493
#endif
494
}
495

496
LJFOLD(UGE any KINT64)
497
LJFOLDF(kfold_int64comp0)
17,960✔
498
{
499
#if LJ_HASFFI
500
  if (ir_k64(fright)->u64 == 0)
17,960✔
501
    return DROPFOLD;
×
502
  return NEXTFOLD;
503
#else
504
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
505
#endif
506
}
507

508
/* -- Constant folding for strings ---------------------------------------- */
509

510
LJFOLD(SNEW KKPTR KINT)
511
LJFOLDF(kfold_snew_kptr)
6✔
512
{
513
  GCstr *s = lj_str_new(J->L, (const char *)ir_kptr(fleft), (size_t)fright->i);
6✔
514
  return lj_ir_kstr(J, s);
6✔
515
}
516

517
LJFOLD(SNEW any KINT)
518
LJFOLDF(kfold_snew_empty)
14✔
519
{
520
  if (fright->i == 0)
14✔
521
    return lj_ir_kstr(J, &J2G(J)->strempty);
×
522
  return NEXTFOLD;
523
}
524

525
LJFOLD(STRREF KGC KINT)
526
LJFOLDF(kfold_strref)
113✔
527
{
528
  GCstr *str = ir_kstr(fleft);
113✔
529
  lj_assertJ((MSize)fright->i <= str->len, "bad string ref");
113✔
530
  return lj_ir_kkptr(J, (char *)strdata(str) + fright->i);
113✔
531
}
532

533
LJFOLD(STRREF SNEW any)
534
LJFOLDF(kfold_strref_snew)
3✔
535
{
536
  PHIBARRIER(fleft);
3✔
537
  if (irref_isk(fins->op2) && fright->i == 0) {
3✔
538
    return fleft->op1;  /* strref(snew(ptr, len), 0) ==> ptr */
2✔
539
  } else {
540
    /* Reassociate: strref(snew(strref(str, a), len), b) ==> strref(str, a+b) */
541
    IRIns *ir = IR(fleft->op1);
1✔
542
    if (ir->o == IR_STRREF) {
1✔
543
      IRRef1 str = ir->op1;  /* IRIns * is not valid across emitir. */
×
544
      PHIBARRIER(ir);
×
545
      fins->op2 = emitir(IRTI(IR_ADD), ir->op2, fins->op2); /* Clobbers fins! */
×
546
      fins->op1 = str;
×
547
      fins->ot = IRT(IR_STRREF, IRT_PGC);
×
548
      return RETRYFOLD;
×
549
    }
550
  }
551
  return NEXTFOLD;
552
}
553

554
LJFOLD(CALLN CARG IRCALL_lj_str_cmp)
555
LJFOLDF(kfold_strcmp)
2✔
556
{
557
  if (irref_isk(fleft->op1) && irref_isk(fleft->op2)) {
2✔
558
    GCstr *a = ir_kstr(IR(fleft->op1));
×
559
    GCstr *b = ir_kstr(IR(fleft->op2));
×
560
    return INTFOLD(lj_str_cmp(a, b));
×
561
  }
562
  return NEXTFOLD;
563
}
564

565
/* -- Constant folding and forwarding for buffers ------------------------- */
566

567
/*
568
** Buffer ops perform stores, but their effect is limited to the buffer
569
** itself. Also, buffer ops are chained: a use of an op implies a use of
570
** all other ops up the chain. Conversely, if an op is unused, all ops
571
** up the chain can go unsed. This largely eliminates the need to treat
572
** them as stores.
573
**
574
** Alas, treating them as normal (IRM_N) ops doesn't work, because they
575
** cannot be CSEd in isolation. CSE for IRM_N is implicitly done in LOOP
576
** or if FOLD is disabled.
577
**
578
** The compromise is to declare them as loads, emit them like stores and
579
** CSE whole chains manually when the BUFSTR is to be emitted. Any chain
580
** fragments left over from CSE are eliminated by DCE.
581
*/
582

583
/* BUFHDR is emitted like a store, see below. */
584

585
LJFOLD(BUFPUT BUFHDR BUFSTR)
586
LJFOLDF(bufput_append)
51✔
587
{
588
  /* New buffer, no other buffer op inbetween and same buffer? */
589
  if ((J->flags & JIT_F_OPT_FWD) &&
51✔
590
      !(fleft->op2 & IRBUFHDR_APPEND) &&
51✔
591
      fleft->prev == fright->op2 &&
51✔
592
      fleft->op1 == IR(fright->op2)->op1 &&
36✔
593
      !(irt_isphi(fright->t) && IR(fright->op2)->prev)) {
33✔
594
    IRRef ref = fins->op1;
32✔
595
    IR(ref)->op2 = (fleft->op2 | IRBUFHDR_APPEND);  /* Modify BUFHDR. */
32✔
596
    IR(ref)->op1 = fright->op1;
32✔
597
    return ref;
32✔
598
  }
599
  return EMITFOLD;  /* Always emit, CSE later. */
19✔
600
}
601

602
LJFOLD(BUFPUT any any)
603
LJFOLDF(bufput_kgc)
1,878✔
604
{
605
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && fright->o == IR_KGC) {
1,878✔
606
    GCstr *s2 = ir_kstr(fright);
1,010✔
607
    if (s2->len == 0) {  /* Empty string? */
1,010✔
608
      return LEFTFOLD;
17✔
609
    } else {
610
      if (fleft->o == IR_BUFPUT && irref_isk(fleft->op2) &&
993✔
611
          !irt_isphi(fleft->t)) {  /* Join two constant string puts in a row. */
235✔
612
        GCstr *s1 = ir_kstr(IR(fleft->op2));
235✔
613
        IRRef kref = lj_ir_kstr(J, lj_buf_cat2str(J->L, s1, s2));
235✔
614
        /* lj_ir_kstr() may realloc the IR and invalidates any IRIns *. */
615
        IR(fins->op1)->op2 = kref;  /* Modify previous BUFPUT. */
234✔
616
        return fins->op1;
234✔
617
      }
618
    }
619
  }
620
  return EMITFOLD;  /* Always emit, CSE later. */
1,626✔
621
}
622

623
LJFOLD(BUFSTR any any)
624
LJFOLDF(bufstr_kfold_cse)
1,046✔
625
{
626
  lj_assertJ(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
1,046✔
627
             fleft->o == IR_CALLL,
628
             "bad buffer constructor IR op %d", fleft->o);
629
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
1,046✔
630
    if (fleft->o == IR_BUFHDR) {  /* No put operations? */
1,036✔
631
      if (!(fleft->op2 & IRBUFHDR_APPEND))  /* Empty buffer? */
2✔
632
        return lj_ir_kstr(J, &J2G(J)->strempty);
1✔
633
      fins->op1 = fleft->op1;
1✔
634
      fins->op2 = fleft->prev;  /* Relies on checks in bufput_append. */
1✔
635
      return CSEFOLD;
1✔
636
    } else if (fleft->o == IR_BUFPUT) {
1,034✔
637
      IRIns *irb = IR(fleft->op1);
715✔
638
      if (irb->o == IR_BUFHDR && !(irb->op2 & IRBUFHDR_APPEND))
715✔
639
        return fleft->op2;  /* Shortcut for a single put operation. */
116✔
640
    }
641
  }
642
  /* Try to CSE the whole chain. */
643
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
928✔
644
    IRRef ref = J->chain[IR_BUFSTR];
920✔
645
    while (ref) {
3,900✔
646
      IRIns *irs = IR(ref), *ira = fleft, *irb = IR(irs->op1);
3,030✔
647
      while (ira->o == irb->o && ira->op2 == irb->op2) {
3,765✔
648
        lj_assertJ(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
796✔
649
                   ira->o == IR_CALLL || ira->o == IR_CARG,
650
                   "bad buffer constructor IR op %d", ira->o);
651
        if (ira->o == IR_BUFHDR && !(ira->op2 & IRBUFHDR_APPEND))
796✔
652
          return ref;  /* CSE succeeded. */
50✔
653
        if (ira->o == IR_CALLL && ira->op2 == IRCALL_lj_buf_puttab)
746✔
654
          break;
655
        ira = IR(ira->op1);
735✔
656
        irb = IR(irb->op1);
735✔
657
      }
658
      ref = irs->prev;
2,980✔
659
    }
660
  }
661
  return EMITFOLD;  /* No CSE possible. */
878✔
662
}
663

664
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_reverse)
665
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_upper)
666
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_lower)
667
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putquoted)
668
LJFOLDF(bufput_kfold_op)
33✔
669
{
670
  if (irref_isk(fleft->op2)) {
33✔
671
    const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
3✔
672
    SBuf *sb = lj_buf_tmp_(J->L);
3✔
673
    sb = ((SBuf * (LJ_FASTCALL *)(SBuf *, GCstr *))ci->func)(sb,
6✔
674
                                                       ir_kstr(IR(fleft->op2)));
3✔
675
    fins->o = IR_BUFPUT;
3✔
676
    fins->op1 = fleft->op1;
3✔
677
    fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
3✔
678
    return RETRYFOLD;
3✔
679
  }
680
  return EMITFOLD;  /* Always emit, CSE later. */
30✔
681
}
682

683
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_rep)
684
LJFOLDF(bufput_kfold_rep)
113✔
685
{
686
  if (irref_isk(fleft->op2)) {
113✔
687
    IRIns *irc = IR(fleft->op1);
6✔
688
    if (irref_isk(irc->op2)) {
6✔
689
      SBuf *sb = lj_buf_tmp_(J->L);
4✔
690
      sb = lj_buf_putstr_rep(sb, ir_kstr(IR(irc->op2)), IR(fleft->op2)->i);
4✔
691
      fins->o = IR_BUFPUT;
4✔
692
      fins->op1 = irc->op1;
4✔
693
      fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
4✔
694
      return RETRYFOLD;
4✔
695
    }
696
  }
697
  return EMITFOLD;  /* Always emit, CSE later. */
109✔
698
}
699

700
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfxint)
701
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum_int)
702
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum_uint)
703
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum)
704
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfstr)
705
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfchar)
706
LJFOLDF(bufput_kfold_fmt)
262✔
707
{
708
  IRIns *irc = IR(fleft->op1);
262✔
709
  lj_assertJ(irref_isk(irc->op2), "SFormat must be const");
262✔
710
  if (irref_isk(fleft->op2)) {
262✔
711
    SFormat sf = (SFormat)IR(irc->op2)->i;
67✔
712
    IRIns *ira = IR(fleft->op2);
67✔
713
    SBuf *sb = lj_buf_tmp_(J->L);
67✔
714
    switch (fins->op2) {
67✔
715
    case IRCALL_lj_strfmt_putfxint:
2✔
716
      sb = lj_strfmt_putfxint(sb, sf, ir_k64(ira)->u64);
2✔
717
      break;
2✔
718
    case IRCALL_lj_strfmt_putfstr:
×
719
      sb = lj_strfmt_putfstr(sb, sf, ir_kstr(ira));
×
720
      break;
×
721
    case IRCALL_lj_strfmt_putfchar:
×
722
      sb = lj_strfmt_putfchar(sb, sf, ira->i);
×
723
      break;
×
724
    case IRCALL_lj_strfmt_putfnum_int:
65✔
725
    case IRCALL_lj_strfmt_putfnum_uint:
726
    case IRCALL_lj_strfmt_putfnum:
727
    default: {
728
      const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
65✔
729
      sb = ((SBuf * (*)(SBuf *, SFormat, lua_Number))ci->func)(sb, sf,
65✔
730
                                                         ir_knum(ira)->n);
731
      break;
65✔
732
      }
733
    }
734
    fins->o = IR_BUFPUT;
67✔
735
    fins->op1 = irc->op1;
67✔
736
    fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
67✔
737
    return RETRYFOLD;
67✔
738
  }
739
  return EMITFOLD;  /* Always emit, CSE later. */
195✔
740
}
741

742
/* -- Constant folding of pointer arithmetic ------------------------------ */
743

744
LJFOLD(ADD KGC KINT)
745
LJFOLD(ADD KGC KINT64)
746
LJFOLDF(kfold_add_kgc)
46✔
747
{
748
  GCobj *o = ir_kgc(fleft);
46✔
749
#if LJ_64
750
  ptrdiff_t ofs = (ptrdiff_t)ir_kint64(fright)->u64;
46✔
751
#else
752
  ptrdiff_t ofs = fright->i;
753
#endif
754
#if LJ_HASFFI
755
  if (irt_iscdata(fleft->t)) {
46✔
756
    CType *ct = ctype_raw(ctype_ctsG(J2G(J)), gco2cd(o)->ctypeid);
41✔
757
    if (ctype_isnum(ct->info) || ctype_isenum(ct->info) ||
41✔
758
        ctype_isptr(ct->info) || ctype_isfunc(ct->info) ||
41✔
759
        ctype_iscomplex(ct->info) || ctype_isvector(ct->info))
41✔
760
      return lj_ir_kkptr(J, (char *)o + ofs);
1✔
761
  }
762
#endif
763
  return lj_ir_kptr(J, (char *)o + ofs);
45✔
764
}
765

766
LJFOLD(ADD KPTR KINT)
767
LJFOLD(ADD KPTR KINT64)
768
LJFOLD(ADD KKPTR KINT)
769
LJFOLD(ADD KKPTR KINT64)
770
LJFOLDF(kfold_add_kptr)
26✔
771
{
772
  void *p = ir_kptr(fleft);
26✔
773
#if LJ_64
774
  ptrdiff_t ofs = (ptrdiff_t)ir_kint64(fright)->u64;
26✔
775
#else
776
  ptrdiff_t ofs = fright->i;
777
#endif
778
  return lj_ir_kptr_(J, fleft->o, (char *)p + ofs);
26✔
779
}
780

781
LJFOLD(ADD any KGC)
782
LJFOLD(ADD any KPTR)
783
LJFOLD(ADD any KKPTR)
784
LJFOLDF(kfold_add_kright)
28✔
785
{
786
  if (fleft->o == IR_KINT || fleft->o == IR_KINT64) {
28✔
787
    IRRef1 tmp = fins->op1; fins->op1 = fins->op2; fins->op2 = tmp;
24✔
788
    return RETRYFOLD;
24✔
789
  }
790
  return NEXTFOLD;
791
}
792

793
/* -- Constant folding of conversions ------------------------------------- */
794

795
LJFOLD(TOBIT KNUM KNUM)
796
LJFOLDF(kfold_tobit)
2✔
797
{
798
  return INTFOLD(lj_num2bit(knumleft));
2✔
799
}
800

801
LJFOLD(CONV KINT IRCONV_NUM_INT)
802
LJFOLDF(kfold_conv_kint_num)
246,944✔
803
{
804
  return lj_ir_knum(J, (lua_Number)fleft->i);
246,944✔
805
}
806

807
LJFOLD(CONV KINT IRCONV_NUM_U32)
808
LJFOLDF(kfold_conv_kintu32_num)
×
809
{
810
  return lj_ir_knum(J, (lua_Number)(uint32_t)fleft->i);
×
811
}
812

813
LJFOLD(CONV KINT IRCONV_INT_I8)
814
LJFOLD(CONV KINT IRCONV_INT_U8)
815
LJFOLD(CONV KINT IRCONV_INT_I16)
816
LJFOLD(CONV KINT IRCONV_INT_U16)
817
LJFOLDF(kfold_conv_kint_ext)
5✔
818
{
819
  int32_t k = fleft->i;
5✔
820
  if ((fins->op2 & IRCONV_SRCMASK) == IRT_I8) k = (int8_t)k;
5✔
821
  else if ((fins->op2 & IRCONV_SRCMASK) == IRT_U8) k = (uint8_t)k;
5✔
822
  else if ((fins->op2 & IRCONV_SRCMASK) == IRT_I16) k = (int16_t)k;
×
823
  else k = (uint16_t)k;
×
824
  return INTFOLD(k);
5✔
825
}
826

827
LJFOLD(CONV KINT IRCONV_I64_INT)
828
LJFOLD(CONV KINT IRCONV_U64_INT)
829
LJFOLD(CONV KINT IRCONV_I64_U32)
830
LJFOLD(CONV KINT IRCONV_U64_U32)
831
LJFOLDF(kfold_conv_kint_i64)
25,951✔
832
{
833
  if ((fins->op2 & IRCONV_SEXT))
25,951✔
834
    return INT64FOLD((uint64_t)(int64_t)fleft->i);
25,944✔
835
  else
836
    return INT64FOLD((uint64_t)(int64_t)(uint32_t)fleft->i);
7✔
837
}
838

839
LJFOLD(CONV KINT64 IRCONV_NUM_I64)
840
LJFOLDF(kfold_conv_kint64_num_i64)
1✔
841
{
842
  return lj_ir_knum(J, (lua_Number)(int64_t)ir_kint64(fleft)->u64);
1✔
843
}
844

845
LJFOLD(CONV KINT64 IRCONV_NUM_U64)
846
LJFOLDF(kfold_conv_kint64_num_u64)
×
847
{
848
  return lj_ir_knum(J, (lua_Number)ir_kint64(fleft)->u64);
×
849
}
850

851
LJFOLD(CONV KINT64 IRCONV_INT_I64)
852
LJFOLD(CONV KINT64 IRCONV_U32_I64)
853
LJFOLDF(kfold_conv_kint64_int_i64)
×
854
{
855
  return INTFOLD((int32_t)ir_kint64(fleft)->u64);
×
856
}
857

858
LJFOLD(CONV KNUM IRCONV_INT_NUM)
859
LJFOLDF(kfold_conv_knum_int_num)
7✔
860
{
861
  lua_Number n = knumleft;
7✔
862
  int32_t k = lj_num2int(n);
7✔
863
  if (irt_isguard(fins->t) && n != (lua_Number)k) {
7✔
864
    /* We're about to create a guard which always fails, like CONV +1.5.
865
    ** Some pathological loops cause this during LICM, e.g.:
866
    **   local x,k,t = 0,1.5,{1,[1.5]=2}
867
    **   for i=1,200 do x = x+ t[k]; k = k == 1 and 1.5 or 1 end
868
    **   assert(x == 300)
869
    */
870
    return FAILFOLD;
871
  }
872
  return INTFOLD(k);
7✔
873
}
874

875
LJFOLD(CONV KNUM IRCONV_U32_NUM)
876
LJFOLDF(kfold_conv_knum_u32_num)
2✔
877
{
878
#ifdef _MSC_VER
879
  {  /* Workaround for MSVC bug. */
880
    volatile uint32_t u = (uint32_t)knumleft;
881
    return INTFOLD((int32_t)u);
882
  }
883
#else
884
  return INTFOLD((int32_t)(uint32_t)knumleft);
2✔
885
#endif
886
}
887

888
LJFOLD(CONV KNUM IRCONV_I64_NUM)
889
LJFOLDF(kfold_conv_knum_i64_num)
11✔
890
{
891
  return INT64FOLD((uint64_t)(int64_t)knumleft);
11✔
892
}
893

894
LJFOLD(CONV KNUM IRCONV_U64_NUM)
895
LJFOLDF(kfold_conv_knum_u64_num)
10,268✔
896
{
897
  return INT64FOLD(lj_num2u64(knumleft));
10,268✔
898
}
899

900
LJFOLD(TOSTR KNUM any)
901
LJFOLDF(kfold_tostr_knum)
10✔
902
{
903
  return lj_ir_kstr(J, lj_strfmt_num(J->L, ir_knum(fleft)));
10✔
904
}
905

906
LJFOLD(TOSTR KINT any)
907
LJFOLDF(kfold_tostr_kint)
10✔
908
{
909
  return lj_ir_kstr(J, fins->op2 == IRTOSTR_INT ?
10✔
910
                       lj_strfmt_int(J->L, fleft->i) :
911
                       lj_strfmt_char(J->L, fleft->i));
912
}
913

914
LJFOLD(STRTO KGC)
915
LJFOLDF(kfold_strto)
47✔
916
{
917
  TValue n;
47✔
918
  if (lj_strscan_num(ir_kstr(fleft), &n))
47✔
919
    return lj_ir_knum(J, numV(&n));
47✔
920
  return FAILFOLD;
921
}
922

923
/* -- Constant folding of equality checks --------------------------------- */
924

925
/* Don't constant-fold away FLOAD checks against KNULL. */
926
LJFOLD(EQ FLOAD KNULL)
927
LJFOLD(NE FLOAD KNULL)
928
LJFOLDX(lj_opt_cse)
929

930
/* But fold all other KNULL compares, since only KNULL is equal to KNULL. */
931
LJFOLD(EQ any KNULL)
932
LJFOLD(NE any KNULL)
933
LJFOLD(EQ KNULL any)
934
LJFOLD(NE KNULL any)
935
LJFOLD(EQ KINT KINT)  /* Constants are unique, so same refs <==> same value. */
936
LJFOLD(NE KINT KINT)
937
LJFOLD(EQ KINT64 KINT64)
938
LJFOLD(NE KINT64 KINT64)
939
LJFOLD(EQ KGC KGC)
940
LJFOLD(NE KGC KGC)
941
LJFOLDF(kfold_kref)
772,152✔
942
{
943
  return CONDFOLD((fins->op1 == fins->op2) ^ (fins->o == IR_NE));
772,152✔
944
}
945

946
/* -- Algebraic shortcuts ------------------------------------------------- */
947

948
LJFOLD(FPMATH FPMATH IRFPM_FLOOR)
949
LJFOLD(FPMATH FPMATH IRFPM_CEIL)
950
LJFOLD(FPMATH FPMATH IRFPM_TRUNC)
951
LJFOLDF(shortcut_round)
×
952
{
953
  IRFPMathOp op = (IRFPMathOp)fleft->op2;
×
954
  if (op == IRFPM_FLOOR || op == IRFPM_CEIL || op == IRFPM_TRUNC)
×
955
    return LEFTFOLD;  /* round(round_left(x)) = round_left(x) */
×
956
  return NEXTFOLD;
957
}
958

959
LJFOLD(ABS ABS FLOAD)
960
LJFOLDF(shortcut_left)
×
961
{
962
  return LEFTFOLD;  /* f(g(x)) ==> g(x) */
×
963
}
964

965
LJFOLD(ABS NEG FLOAD)
966
LJFOLDF(shortcut_dropleft)
×
967
{
968
  PHIBARRIER(fleft);
×
969
  fins->op1 = fleft->op1;  /* abs(neg(x)) ==> abs(x) */
×
970
  return RETRYFOLD;
×
971
}
972

973
/* Note: no safe shortcuts with STRTO and TOSTR ("1e2" ==> +100 ==> "100"). */
974
LJFOLD(NEG NEG any)
975
LJFOLD(BNOT BNOT)
976
LJFOLD(BSWAP BSWAP)
977
LJFOLDF(shortcut_leftleft)
×
978
{
979
  PHIBARRIER(fleft);  /* See above. Fold would be ok, but not beneficial. */
×
980
  return fleft->op1;  /* f(g(x)) ==> x */
×
981
}
982

983
/* -- FP algebraic simplifications ---------------------------------------- */
984

985
/* FP arithmetic is tricky -- there's not much to simplify.
986
** Please note the following common pitfalls before sending "improvements":
987
**   x+0 ==> x  is INVALID for x=-0
988
**   0-x ==> -x is INVALID for x=+0
989
**   x*0 ==> 0  is INVALID for x=-0, x=+-Inf or x=NaN
990
*/
991

992
LJFOLD(ADD NEG any)
993
LJFOLDF(simplify_numadd_negx)
×
994
{
995
  PHIBARRIER(fleft);
×
996
  fins->o = IR_SUB;  /* (-a) + b ==> b - a */
×
997
  fins->op1 = fins->op2;
×
998
  fins->op2 = fleft->op1;
×
999
  return RETRYFOLD;
×
1000
}
1001

1002
LJFOLD(ADD any NEG)
1003
LJFOLDF(simplify_numadd_xneg)
×
1004
{
1005
  PHIBARRIER(fright);
×
1006
  fins->o = IR_SUB;  /* a + (-b) ==> a - b */
×
1007
  fins->op2 = fright->op1;
×
1008
  return RETRYFOLD;
×
1009
}
1010

1011
LJFOLD(SUB any KNUM)
1012
LJFOLDF(simplify_numsub_k)
1,352✔
1013
{
1014
  if (ir_knum(fright)->u64 == 0)  /* x - (+0) ==> x */
1,352✔
1015
    return LEFTFOLD;
885✔
1016
  return NEXTFOLD;
1017
}
1018

1019
LJFOLD(SUB NEG KNUM)
1020
LJFOLDF(simplify_numsub_negk)
×
1021
{
1022
  PHIBARRIER(fleft);
×
1023
  fins->op2 = fleft->op1;  /* (-x) - k ==> (-k) - x */
×
1024
  fins->op1 = (IRRef1)lj_ir_knum(J, -knumright);
×
1025
  return RETRYFOLD;
×
1026
}
1027

1028
LJFOLD(SUB any NEG)
1029
LJFOLDF(simplify_numsub_xneg)
×
1030
{
1031
  PHIBARRIER(fright);
×
1032
  fins->o = IR_ADD;  /* a - (-b) ==> a + b */
×
1033
  fins->op2 = fright->op1;
×
1034
  return RETRYFOLD;
×
1035
}
1036

1037
LJFOLD(MUL any KNUM)
1038
LJFOLD(DIV any KNUM)
1039
LJFOLDF(simplify_nummuldiv_k)
315✔
1040
{
1041
  lua_Number n = knumright;
315✔
1042
  if (n == 1.0) {  /* x o 1 ==> x */
315✔
1043
    return LEFTFOLD;
2✔
1044
  } else if (n == -1.0) {  /* x o -1 ==> -x */
313✔
1045
    IRRef op1 = fins->op1;
×
1046
    fins->op2 = (IRRef1)lj_ir_ksimd(J, LJ_KSIMD_NEG);  /* Modifies fins. */
×
1047
    fins->op1 = op1;
×
1048
    fins->o = IR_NEG;
×
1049
    return RETRYFOLD;
×
1050
  } else if (fins->o == IR_MUL && n == 2.0) {  /* x * 2 ==> x + x */
313✔
1051
    fins->o = IR_ADD;
3✔
1052
    fins->op2 = fins->op1;
3✔
1053
    return RETRYFOLD;
3✔
1054
  } else if (fins->o == IR_DIV) {  /* x / 2^k ==> x * 2^-k */
310✔
1055
    uint64_t u = ir_knum(fright)->u64;
133✔
1056
    uint32_t ex = ((uint32_t)(u >> 52) & 0x7ff);
133✔
1057
    if ((u & U64x(000fffff,ffffffff)) == 0 && ex - 1 < 0x7fd) {
133✔
1058
      u = (u & ((uint64_t)1 << 63)) | ((uint64_t)(0x7fe - ex) << 52);
10✔
1059
      fins->o = IR_MUL;  /* Multiply by exact reciprocal. */
10✔
1060
      fins->op2 = lj_ir_knum_u64(J, u);
10✔
1061
      return RETRYFOLD;
10✔
1062
    }
1063
  }
1064
  return NEXTFOLD;
1065
}
1066

1067
LJFOLD(MUL NEG KNUM)
1068
LJFOLD(DIV NEG KNUM)
1069
LJFOLDF(simplify_nummuldiv_negk)
×
1070
{
1071
  PHIBARRIER(fleft);
×
1072
  fins->op1 = fleft->op1;  /* (-a) o k ==> a o (-k) */
×
1073
  fins->op2 = (IRRef1)lj_ir_knum(J, -knumright);
×
1074
  return RETRYFOLD;
×
1075
}
1076

1077
LJFOLD(MUL NEG NEG)
1078
LJFOLD(DIV NEG NEG)
1079
LJFOLDF(simplify_nummuldiv_negneg)
×
1080
{
1081
  PHIBARRIER(fleft);
×
1082
  PHIBARRIER(fright);
×
1083
  fins->op1 = fleft->op1;  /* (-a) o (-b) ==> a o b */
×
1084
  fins->op2 = fright->op1;
×
1085
  return RETRYFOLD;
×
1086
}
1087

1088
LJFOLD(POW any KNUM)
1089
LJFOLDF(simplify_numpow_k)
4✔
1090
{
1091
  if (knumright == 0)  /* x ^ 0 ==> 1 */
4✔
1092
    return lj_ir_knum_one(J);  /* Result must be a number, not an int. */
×
1093
  else if (knumright == 1)  /* x ^ 1 ==> x */
4✔
1094
    return LEFTFOLD;
×
1095
  else if (knumright == 2)  /* x ^ 2 ==> x * x */
4✔
1096
    return emitir(IRTN(IR_MUL), fins->op1, fins->op1);
×
1097
  else
1098
    return NEXTFOLD;
1099
}
1100

1101
/* -- Simplify conversions ------------------------------------------------ */
1102

1103
LJFOLD(CONV CONV IRCONV_NUM_INT)  /* _NUM */
1104
LJFOLDF(shortcut_conv_num_int)
27✔
1105
{
1106
  PHIBARRIER(fleft);
27✔
1107
  /* Only safe with a guarded conversion to int. */
1108
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_NUM && irt_isguard(fleft->t))
27✔
1109
    return fleft->op1;  /* f(g(x)) ==> x */
8✔
1110
  return NEXTFOLD;
1111
}
1112

1113
LJFOLD(CONV CONV IRCONV_INT_NUM)  /* _INT */
1114
LJFOLD(CONV CONV IRCONV_U32_NUM)  /* _U32*/
1115
LJFOLDF(simplify_conv_int_num)
125✔
1116
{
1117
  /* Fold even across PHI to avoid expensive num->int conversions in loop. */
1118
  if ((fleft->op2 & IRCONV_SRCMASK) ==
125✔
1119
      ((fins->op2 & IRCONV_DSTMASK) >> IRCONV_DSH))
125✔
1120
    return fleft->op1;
23✔
1121
  return NEXTFOLD;
1122
}
1123

1124
LJFOLD(CONV CONV IRCONV_I64_NUM)  /* _INT or _U32 */
1125
LJFOLD(CONV CONV IRCONV_U64_NUM)  /* _INT or _U32 */
1126
LJFOLDF(simplify_conv_i64_num)
103✔
1127
{
1128
  PHIBARRIER(fleft);
103✔
1129
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
102✔
1130
    /* Reduce to a sign-extension. */
1131
    fins->op1 = fleft->op1;
×
1132
    fins->op2 = ((IRT_I64<<5)|IRT_INT|IRCONV_SEXT);
×
1133
    return RETRYFOLD;
×
1134
  } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
102✔
1135
#if LJ_TARGET_X64
1136
    return fleft->op1;
1✔
1137
#else
1138
    /* Reduce to a zero-extension. */
1139
    fins->op1 = fleft->op1;
1140
    fins->op2 = (IRT_I64<<5)|IRT_U32;
1141
    return RETRYFOLD;
1142
#endif
1143
  }
1144
  return NEXTFOLD;
1145
}
1146

1147
LJFOLD(CONV CONV IRCONV_INT_I64)  /* _INT or _U32 */
1148
LJFOLD(CONV CONV IRCONV_INT_U64)  /* _INT or _U32 */
1149
LJFOLD(CONV CONV IRCONV_U32_I64)  /* _INT or _U32 */
1150
LJFOLD(CONV CONV IRCONV_U32_U64)  /* _INT or _U32 */
1151
LJFOLDF(simplify_conv_int_i64)
2✔
1152
{
1153
  int src;
2✔
1154
  PHIBARRIER(fleft);
2✔
1155
  src = (fleft->op2 & IRCONV_SRCMASK);
2✔
1156
  if (src == IRT_INT || src == IRT_U32) {
2✔
1157
    if (src == ((fins->op2 & IRCONV_DSTMASK) >> IRCONV_DSH)) {
2✔
1158
      return fleft->op1;
×
1159
    } else {
1160
      fins->op2 = ((fins->op2 & IRCONV_DSTMASK) | src);
2✔
1161
      fins->op1 = fleft->op1;
2✔
1162
      return RETRYFOLD;
2✔
1163
    }
1164
  }
1165
  return NEXTFOLD;
1166
}
1167

1168
LJFOLD(CONV CONV IRCONV_FLOAT_NUM)  /* _FLOAT */
1169
LJFOLDF(simplify_conv_flt_num)
5✔
1170
{
1171
  PHIBARRIER(fleft);
5✔
1172
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_FLOAT)
5✔
1173
    return fleft->op1;
1✔
1174
  return NEXTFOLD;
1175
}
1176

1177
/* Shortcut TOBIT + IRT_NUM <- IRT_INT/IRT_U32 conversion. */
1178
LJFOLD(TOBIT CONV KNUM)
1179
LJFOLDF(simplify_tobit_conv)
41✔
1180
{
1181
  /* Fold even across PHI to avoid expensive num->int conversions in loop. */
1182
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
41✔
1183
    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
7✔
1184
    return fleft->op1;
7✔
1185
  } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
34✔
1186
    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
2✔
1187
    fins->o = IR_CONV;
2✔
1188
    fins->op1 = fleft->op1;
2✔
1189
    fins->op2 = (IRT_INT<<5)|IRT_U32;
2✔
1190
    return RETRYFOLD;
2✔
1191
  }
1192
  return NEXTFOLD;
1193
}
1194

1195
/* Shortcut floor/ceil/round + IRT_NUM <- IRT_INT/IRT_U32 conversion. */
1196
LJFOLD(FPMATH CONV IRFPM_FLOOR)
1197
LJFOLD(FPMATH CONV IRFPM_CEIL)
1198
LJFOLD(FPMATH CONV IRFPM_TRUNC)
1199
LJFOLDF(simplify_floor_conv)
×
1200
{
1201
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT ||
×
1202
      (fleft->op2 & IRCONV_SRCMASK) == IRT_U32)
1203
    return LEFTFOLD;
×
1204
  return NEXTFOLD;
1205
}
1206

1207
/* Strength reduction of widening. */
1208
LJFOLD(CONV any IRCONV_I64_INT)
1209
LJFOLD(CONV any IRCONV_U64_INT)
1210
LJFOLDF(simplify_conv_sext)
10,715✔
1211
{
1212
  IRRef ref = fins->op1;
10,715✔
1213
  int64_t ofs = 0;
10,715✔
1214
  if (!(fins->op2 & IRCONV_SEXT))
10,715✔
1215
    return NEXTFOLD;
1216
  PHIBARRIER(fleft);
10,688✔
1217
  if (fleft->o == IR_XLOAD && (irt_isu8(fleft->t) || irt_isu16(fleft->t)))
10,684✔
1218
    goto ok_reduce;
×
1219
  if (fleft->o == IR_ADD && irref_isk(fleft->op2)) {
10,684✔
1220
    ofs = (int64_t)IR(fleft->op2)->i;
×
1221
    ref = fleft->op1;
×
1222
  }
1223
  /* Use scalar evolution analysis results to strength-reduce sign-extension. */
1224
  if (ref == J->scev.idx) {
10,684✔
1225
    IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop;
234✔
1226
    lj_assertJ(irt_isint(J->scev.t), "only int SCEV supported");
234✔
1227
    if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) {
234✔
1228
    ok_reduce:
230✔
1229
#if LJ_TARGET_X64
1230
      /* Eliminate widening. All 32 bit ops do an implicit zero-extension. */
1231
      return LEFTFOLD;
230✔
1232
#else
1233
      /* Reduce to a (cheaper) zero-extension. */
1234
      fins->op2 &= ~IRCONV_SEXT;
1235
      return RETRYFOLD;
1236
#endif
1237
    }
1238
  }
1239
  return NEXTFOLD;
1240
}
1241

1242
/* Strength reduction of narrowing. */
1243
LJFOLD(CONV ADD IRCONV_INT_I64)
1244
LJFOLD(CONV SUB IRCONV_INT_I64)
1245
LJFOLD(CONV MUL IRCONV_INT_I64)
1246
LJFOLD(CONV ADD IRCONV_INT_U64)
1247
LJFOLD(CONV SUB IRCONV_INT_U64)
1248
LJFOLD(CONV MUL IRCONV_INT_U64)
1249
LJFOLD(CONV ADD IRCONV_U32_I64)
1250
LJFOLD(CONV SUB IRCONV_U32_I64)
1251
LJFOLD(CONV MUL IRCONV_U32_I64)
1252
LJFOLD(CONV ADD IRCONV_U32_U64)
1253
LJFOLD(CONV SUB IRCONV_U32_U64)
1254
LJFOLD(CONV MUL IRCONV_U32_U64)
1255
LJFOLDF(simplify_conv_narrow)
2✔
1256
{
1257
  IROp op = (IROp)fleft->o;
2✔
1258
  IRType t = irt_type(fins->t);
2✔
1259
  IRRef op1 = fleft->op1, op2 = fleft->op2, mode = fins->op2;
2✔
1260
  PHIBARRIER(fleft);
2✔
1261
  op1 = emitir(IRT(IR_CONV, t), op1, mode);
2✔
1262
  op2 = emitir(IRT(IR_CONV, t), op2, mode);
2✔
1263
  fins->ot = IRT(op, t);
2✔
1264
  fins->op1 = op1;
2✔
1265
  fins->op2 = op2;
2✔
1266
  return RETRYFOLD;
2✔
1267
}
1268

1269
/* Special CSE rule for CONV. */
1270
LJFOLD(CONV any any)
1271
LJFOLDF(cse_conv)
77,698✔
1272
{
1273
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
77,698✔
1274
    IRRef op1 = fins->op1, op2 = (fins->op2 & IRCONV_MODEMASK);
77,698✔
1275
    uint8_t guard = irt_isguard(fins->t);
77,698✔
1276
    IRRef ref = J->chain[IR_CONV];
77,698✔
1277
    while (ref > op1) {
105,256✔
1278
      IRIns *ir = IR(ref);
48,729✔
1279
      /* Commoning with stronger checks is ok. */
1280
      if (ir->op1 == op1 && (ir->op2 & IRCONV_MODEMASK) == op2 &&
48,729✔
1281
          irt_isguard(ir->t) >= guard)
21,171✔
1282
        return ref;
21,171✔
1283
      ref = ir->prev;
27,558✔
1284
    }
1285
  }
1286
  return EMITFOLD;  /* No fallthrough to regular CSE. */
56,527✔
1287
}
1288

1289
/* FP conversion narrowing. */
1290
LJFOLD(TOBIT ADD KNUM)
1291
LJFOLD(TOBIT SUB KNUM)
1292
LJFOLD(CONV ADD IRCONV_INT_NUM)
1293
LJFOLD(CONV SUB IRCONV_INT_NUM)
1294
LJFOLD(CONV ADD IRCONV_I64_NUM)
1295
LJFOLD(CONV SUB IRCONV_I64_NUM)
1296
LJFOLDF(narrow_convert)
10,698✔
1297
{
1298
  PHIBARRIER(fleft);
10,698✔
1299
  /* Narrowing ignores PHIs and repeating it inside the loop is not useful. */
1300
  if (J->chain[IR_LOOP])
10,611✔
1301
    return NEXTFOLD;
1302
  lj_assertJ(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT,
10,565✔
1303
             "unexpected CONV TOBIT");
1304
  return lj_opt_narrow_convert(J);
10,565✔
1305
}
1306

1307
/* -- Integer algebraic simplifications ----------------------------------- */
1308

1309
LJFOLD(ADD any KINT)
1310
LJFOLD(ADDOV any KINT)
1311
LJFOLD(SUBOV any KINT)
1312
LJFOLDF(simplify_intadd_k)
8,477✔
1313
{
1314
  if (fright->i == 0)  /* i o 0 ==> i */
8,477✔
1315
    return LEFTFOLD;
182✔
1316
  return NEXTFOLD;
1317
}
1318

1319
LJFOLD(MULOV any KINT)
1320
LJFOLDF(simplify_intmul_k)
4✔
1321
{
1322
  if (fright->i == 0)  /* i * 0 ==> 0 */
4✔
1323
    return RIGHTFOLD;
×
1324
  if (fright->i == 1)  /* i * 1 ==> i */
4✔
1325
    return LEFTFOLD;
×
1326
  if (fright->i == 2) {  /* i * 2 ==> i + i */
4✔
1327
    fins->o = IR_ADDOV;
×
1328
    fins->op2 = fins->op1;
×
1329
    return RETRYFOLD;
×
1330
  }
1331
  return NEXTFOLD;
1332
}
1333

1334
LJFOLD(SUB any KINT)
1335
LJFOLDF(simplify_intsub_k)
81✔
1336
{
1337
  if (fright->i == 0)  /* i - 0 ==> i */
81✔
1338
    return LEFTFOLD;
26✔
1339
  fins->o = IR_ADD;  /* i - k ==> i + (-k) */
55✔
1340
  fins->op2 = (IRRef1)lj_ir_kint(J, (int32_t)(~(uint32_t)fright->i+1u));  /* Overflow for -2^31 ok. */
55✔
1341
  return RETRYFOLD;
55✔
1342
}
1343

1344
LJFOLD(SUB KINT any)
1345
LJFOLD(SUB KINT64 any)
1346
LJFOLDF(simplify_intsub_kleft)
×
1347
{
1348
  if (fleft->o == IR_KINT ? (fleft->i == 0) : (ir_kint64(fleft)->u64 == 0)) {
×
1349
    fins->o = IR_NEG;  /* 0 - i ==> -i */
×
1350
    fins->op1 = fins->op2;
×
1351
    return RETRYFOLD;
×
1352
  }
1353
  return NEXTFOLD;
1354
}
1355

1356
LJFOLD(ADD any KINT64)
1357
LJFOLDF(simplify_intadd_k64)
31,559✔
1358
{
1359
  if (ir_kint64(fright)->u64 == 0)  /* i + 0 ==> i */
31,559✔
1360
    return LEFTFOLD;
1,615✔
1361
  return NEXTFOLD;
1362
}
1363

1364
LJFOLD(SUB any KINT64)
1365
LJFOLDF(simplify_intsub_k64)
20✔
1366
{
1367
  uint64_t k = ir_kint64(fright)->u64;
20✔
1368
  if (k == 0)  /* i - 0 ==> i */
20✔
1369
    return LEFTFOLD;
×
1370
  fins->o = IR_ADD;  /* i - k ==> i + (-k) */
20✔
1371
  fins->op2 = (IRRef1)lj_ir_kint64(J, ~k+1u);
20✔
1372
  return RETRYFOLD;
20✔
1373
}
1374

1375
static TRef simplify_intmul_k(jit_State *J, int32_t k)
25,610✔
1376
{
1377
  /* Note: many more simplifications are possible, e.g. 2^k1 +- 2^k2.
1378
  ** But this is mainly intended for simple address arithmetic.
1379
  ** Also it's easier for the backend to optimize the original multiplies.
1380
  */
1381
  if (k == 0) {  /* i * 0 ==> 0 */
25,610✔
1382
    return RIGHTFOLD;
×
1383
  } else if (k == 1) {  /* i * 1 ==> i */
25,610✔
1384
    return LEFTFOLD;
19,628✔
1385
  } else if ((k & (k-1)) == 0) {  /* i * 2^k ==> i << k */
5,982✔
1386
    fins->o = IR_BSHL;
5,980✔
1387
    fins->op2 = lj_ir_kint(J, lj_fls((uint32_t)k));
5,980✔
1388
    return RETRYFOLD;
5,980✔
1389
  }
1390
  return NEXTFOLD;
1391
}
1392

1393
LJFOLD(MUL any KINT)
1394
LJFOLDF(simplify_intmul_k32)
×
1395
{
1396
  if (fright->i >= 0)
×
1397
    return simplify_intmul_k(J, fright->i);
×
1398
  return NEXTFOLD;
1399
}
1400

1401
LJFOLD(MUL any KINT64)
1402
LJFOLDF(simplify_intmul_k64)
28,518✔
1403
{
1404
#if LJ_HASFFI
1405
  if (ir_kint64(fright)->u64 < 0x80000000u)
28,518✔
1406
    return simplify_intmul_k(J, (int32_t)ir_kint64(fright)->u64);
25,610✔
1407
  return NEXTFOLD;
1408
#else
1409
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
1410
#endif
1411
}
1412

1413
LJFOLD(MOD any KINT)
1414
LJFOLDF(simplify_intmod_k)
77✔
1415
{
1416
  int32_t k = fright->i;
77✔
1417
  lj_assertJ(k != 0, "integer mod 0");
77✔
1418
  if (k > 0 && (k & (k-1)) == 0) {  /* i % (2^k) ==> i & (2^k-1) */
77✔
1419
    fins->o = IR_BAND;
10✔
1420
    fins->op2 = lj_ir_kint(J, k-1);
10✔
1421
    return RETRYFOLD;
10✔
1422
  }
1423
  return NEXTFOLD;
1424
}
1425

1426
LJFOLD(MOD KINT any)
1427
LJFOLDF(simplify_intmod_kleft)
2✔
1428
{
1429
  if (fleft->i == 0)
2✔
1430
    return INTFOLD(0);
×
1431
  return NEXTFOLD;
1432
}
1433

1434
LJFOLD(SUB any any)
1435
LJFOLD(SUBOV any any)
1436
LJFOLDF(simplify_intsub)
79,266✔
1437
{
1438
  if (fins->op1 == fins->op2 && !irt_isnum(fins->t))  /* i - i ==> 0 */
79,266✔
1439
    return irt_is64(fins->t) ? INT64FOLD(0) : INTFOLD(0);
1✔
1440
  return NEXTFOLD;
1441
}
1442

1443
LJFOLD(SUB ADD any)
1444
LJFOLDF(simplify_intsubadd_leftcancel)
10✔
1445
{
1446
  if (!irt_isnum(fins->t)) {
10✔
1447
    PHIBARRIER(fleft);
3✔
1448
    if (fins->op2 == fleft->op1)  /* (i + j) - i ==> j */
2✔
1449
      return fleft->op2;
×
1450
    if (fins->op2 == fleft->op2)  /* (i + j) - j ==> i */
2✔
1451
      return fleft->op1;
×
1452
  }
1453
  return NEXTFOLD;
1454
}
1455

1456
LJFOLD(SUB SUB any)
1457
LJFOLDF(simplify_intsubsub_leftcancel)
128✔
1458
{
1459
  if (!irt_isnum(fins->t)) {
128✔
1460
    PHIBARRIER(fleft);
4✔
1461
    if (fins->op2 == fleft->op1) {  /* (i - j) - i ==> 0 - j */
2✔
1462
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
×
1463
      fins->op2 = fleft->op2;
×
1464
      return RETRYFOLD;
×
1465
    }
1466
  }
1467
  return NEXTFOLD;
1468
}
1469

1470
LJFOLD(SUB any SUB)
1471
LJFOLDF(simplify_intsubsub_rightcancel)
×
1472
{
1473
  if (!irt_isnum(fins->t)) {
×
1474
    PHIBARRIER(fright);
×
1475
    if (fins->op1 == fright->op1)  /* i - (i - j) ==> j */
×
1476
      return fright->op2;
×
1477
  }
1478
  return NEXTFOLD;
1479
}
1480

1481
LJFOLD(SUB any ADD)
1482
LJFOLDF(simplify_intsubadd_rightcancel)
7✔
1483
{
1484
  if (!irt_isnum(fins->t)) {
7✔
1485
    PHIBARRIER(fright);
5✔
1486
    if (fins->op1 == fright->op1) {  /* i - (i + j) ==> 0 - j */
4✔
1487
      fins->op2 = fright->op2;
×
1488
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
×
1489
      return RETRYFOLD;
×
1490
    }
1491
    if (fins->op1 == fright->op2) {  /* i - (j + i) ==> 0 - j */
4✔
1492
      fins->op2 = fright->op1;
×
1493
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
×
1494
      return RETRYFOLD;
×
1495
    }
1496
  }
1497
  return NEXTFOLD;
1498
}
1499

1500
LJFOLD(SUB ADD ADD)
1501
LJFOLDF(simplify_intsubaddadd_cancel)
3✔
1502
{
1503
  if (!irt_isnum(fins->t)) {
3✔
1504
    PHIBARRIER(fleft);
2✔
1505
    PHIBARRIER(fright);
1✔
1506
    if (fleft->op1 == fright->op1) {  /* (i + j1) - (i + j2) ==> j1 - j2 */
×
1507
      fins->op1 = fleft->op2;
×
1508
      fins->op2 = fright->op2;
×
1509
      return RETRYFOLD;
×
1510
    }
1511
    if (fleft->op1 == fright->op2) {  /* (i + j1) - (j2 + i) ==> j1 - j2 */
×
1512
      fins->op1 = fleft->op2;
×
1513
      fins->op2 = fright->op1;
×
1514
      return RETRYFOLD;
×
1515
    }
1516
    if (fleft->op2 == fright->op1) {  /* (j1 + i) - (i + j2) ==> j1 - j2 */
×
1517
      fins->op1 = fleft->op1;
×
1518
      fins->op2 = fright->op2;
×
1519
      return RETRYFOLD;
×
1520
    }
1521
    if (fleft->op2 == fright->op2) {  /* (j1 + i) - (j2 + i) ==> j1 - j2 */
×
1522
      fins->op1 = fleft->op1;
×
1523
      fins->op2 = fright->op1;
×
1524
      return RETRYFOLD;
×
1525
    }
1526
  }
1527
  return NEXTFOLD;
1528
}
1529

1530
LJFOLD(BAND any KINT)
1531
LJFOLD(BAND any KINT64)
1532
LJFOLDF(simplify_band_k)
36,042✔
1533
{
1534
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
36,042✔
1535
                                     (int64_t)ir_k64(fright)->u64;
20✔
1536
  if (k == 0)  /* i & 0 ==> 0 */
36,042✔
1537
    return RIGHTFOLD;
2✔
1538
  if (k == -1)  /* i & -1 ==> i */
36,040✔
1539
    return LEFTFOLD;
×
1540
  return NEXTFOLD;
1541
}
1542

1543
LJFOLD(BOR any KINT)
1544
LJFOLD(BOR any KINT64)
1545
LJFOLDF(simplify_bor_k)
58✔
1546
{
1547
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
58✔
1548
                                     (int64_t)ir_k64(fright)->u64;
6✔
1549
  if (k == 0)  /* i | 0 ==> i */
58✔
1550
    return LEFTFOLD;
×
1551
  if (k == -1)  /* i | -1 ==> -1 */
58✔
1552
    return RIGHTFOLD;
×
1553
  return NEXTFOLD;
1554
}
1555

1556
LJFOLD(BXOR any KINT)
1557
LJFOLD(BXOR any KINT64)
1558
LJFOLDF(simplify_bxor_k)
10✔
1559
{
1560
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
10✔
1561
                                     (int64_t)ir_k64(fright)->u64;
6✔
1562
  if (k == 0)  /* i xor 0 ==> i */
10✔
1563
    return LEFTFOLD;
4✔
1564
  if (k == -1) {  /* i xor -1 ==> ~i */
6✔
1565
    fins->o = IR_BNOT;
×
1566
    fins->op2 = 0;
×
1567
    return RETRYFOLD;
×
1568
  }
1569
  return NEXTFOLD;
1570
}
1571

1572
LJFOLD(BSHL any KINT)
1573
LJFOLD(BSHR any KINT)
1574
LJFOLD(BSAR any KINT)
1575
LJFOLD(BROL any KINT)
1576
LJFOLD(BROR any KINT)
1577
LJFOLDF(simplify_shift_ik)
6,194✔
1578
{
1579
  int32_t mask = irt_is64(fins->t) ? 63 : 31;
6,194✔
1580
  int32_t k = (fright->i & mask);
6,194✔
1581
  if (k == 0)  /* i o 0 ==> i */
6,194✔
1582
    return LEFTFOLD;
1✔
1583
  if (k == 1 && fins->o == IR_BSHL) {  /* i << 1 ==> i + i */
6,193✔
1584
    fins->o = IR_ADD;
18✔
1585
    fins->op2 = fins->op1;
18✔
1586
    return RETRYFOLD;
18✔
1587
  }
1588
  if (k != fright->i) {  /* i o k ==> i o (k & mask) */
6,175✔
1589
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
×
1590
    return RETRYFOLD;
×
1591
  }
1592
#ifndef LJ_TARGET_UNIFYROT
1593
  if (fins->o == IR_BROR) {  /* bror(i, k) ==> brol(i, (-k)&mask) */
6,175✔
1594
    fins->o = IR_BROL;
1✔
1595
    fins->op2 = (IRRef1)lj_ir_kint(J, (-k)&mask);
1✔
1596
    return RETRYFOLD;
1✔
1597
  }
1598
#endif
1599
  return NEXTFOLD;
1600
}
1601

1602
LJFOLD(BSHL any BAND)
1603
LJFOLD(BSHR any BAND)
1604
LJFOLD(BSAR any BAND)
1605
LJFOLD(BROL any BAND)
1606
LJFOLD(BROR any BAND)
1607
LJFOLDF(simplify_shift_andk)
×
1608
{
1609
  IRIns *irk = IR(fright->op2);
×
1610
  PHIBARRIER(fright);
×
1611
  if ((fins->o < IR_BROL ? LJ_TARGET_MASKSHIFT : LJ_TARGET_MASKROT) &&
×
1612
      irk->o == IR_KINT) {  /* i o (j & mask) ==> i o j */
×
1613
    int32_t mask = irt_is64(fins->t) ? 63 : 31;
×
1614
    int32_t k = irk->i & mask;
×
1615
    if (k == mask) {
×
1616
      fins->op2 = fright->op1;
×
1617
      return RETRYFOLD;
×
1618
    }
1619
  }
1620
  return NEXTFOLD;
1621
}
1622

1623
LJFOLD(BSHL KINT any)
1624
LJFOLD(BSHR KINT any)
1625
LJFOLD(BSHL KINT64 any)
1626
LJFOLD(BSHR KINT64 any)
1627
LJFOLDF(simplify_shift1_ki)
2✔
1628
{
1629
  int64_t k = fleft->o == IR_KINT ? (int64_t)fleft->i :
2✔
1630
                                    (int64_t)ir_k64(fleft)->u64;
2✔
1631
  if (k == 0)  /* 0 o i ==> 0 */
2✔
1632
    return LEFTFOLD;
×
1633
  return NEXTFOLD;
1634
}
1635

1636
LJFOLD(BSAR KINT any)
1637
LJFOLD(BROL KINT any)
1638
LJFOLD(BROR KINT any)
1639
LJFOLD(BSAR KINT64 any)
1640
LJFOLD(BROL KINT64 any)
1641
LJFOLD(BROR KINT64 any)
1642
LJFOLDF(simplify_shift2_ki)
2✔
1643
{
1644
  int64_t k = fleft->o == IR_KINT ? (int64_t)fleft->i :
2✔
1645
                                    (int64_t)ir_k64(fleft)->u64;
2✔
1646
  if (k == 0 || k == -1)  /* 0 o i ==> 0; -1 o i ==> -1 */
2✔
1647
    return LEFTFOLD;
×
1648
  return NEXTFOLD;
1649
}
1650

1651
LJFOLD(BSHL BAND KINT)
1652
LJFOLD(BSHR BAND KINT)
1653
LJFOLD(BROL BAND KINT)
1654
LJFOLD(BROR BAND KINT)
1655
LJFOLDF(simplify_shiftk_andk)
10✔
1656
{
1657
  IRIns *irk = IR(fleft->op2);
10✔
1658
  PHIBARRIER(fleft);
10✔
1659
  if (irk->o == IR_KINT) {  /* (i & k1) o k2 ==> (i o k2) & (k1 o k2) */
10✔
1660
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
5✔
1661
    fins->op1 = fleft->op1;
5✔
1662
    fins->op1 = (IRRef1)lj_opt_fold(J);
5✔
1663
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
5✔
1664
    fins->ot = IRTI(IR_BAND);
5✔
1665
    return RETRYFOLD;
5✔
1666
  } else if (irk->o == IR_KINT64) {
5✔
1667
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, fright->i,
5✔
1668
                                  (IROp)fins->o);
5✔
1669
    IROpT ot = fleft->ot;
5✔
1670
    fins->op1 = fleft->op1;
5✔
1671
    fins->op1 = (IRRef1)lj_opt_fold(J);
5✔
1672
    fins->op2 = (IRRef1)lj_ir_kint64(J, k);
5✔
1673
    fins->ot = ot;
5✔
1674
    return RETRYFOLD;
5✔
1675
  }
1676
  return NEXTFOLD;
1677
}
1678

1679
LJFOLD(BAND BSHL KINT)
1680
LJFOLD(BAND BSHR KINT)
1681
LJFOLDF(simplify_andk_shiftk)
14✔
1682
{
1683
  IRIns *irk = IR(fleft->op2);
14✔
1684
  if (irk->o == IR_KINT &&
14✔
1685
      kfold_intop(-1, irk->i, (IROp)fleft->o) == fright->i)
14✔
1686
    return LEFTFOLD;  /* (i o k1) & k2 ==> i, if (-1 o k1) == k2 */
2✔
1687
  return NEXTFOLD;
1688
}
1689

1690
LJFOLD(BAND BOR KINT)
1691
LJFOLD(BOR BAND KINT)
1692
LJFOLDF(simplify_andor_k)
×
1693
{
1694
  IRIns *irk = IR(fleft->op2);
×
1695
  PHIBARRIER(fleft);
×
1696
  if (irk->o == IR_KINT) {
×
1697
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
×
1698
    /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
1699
    /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
1700
    if (k == (fins->o == IR_BAND ? 0 : -1)) {
×
1701
      fins->op1 = fleft->op1;
×
1702
      return RETRYFOLD;
×
1703
    }
1704
  }
1705
  return NEXTFOLD;
1706
}
1707

1708
LJFOLD(BAND BOR KINT64)
1709
LJFOLD(BOR BAND KINT64)
1710
LJFOLDF(simplify_andor_k64)
×
1711
{
1712
#if LJ_HASFFI
1713
  IRIns *irk = IR(fleft->op2);
×
1714
  PHIBARRIER(fleft);
×
1715
  if (irk->o == IR_KINT64) {
×
1716
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
×
1717
                                  (IROp)fins->o);
×
1718
    /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
1719
    /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
1720
    if (k == (fins->o == IR_BAND ? (uint64_t)0 : ~(uint64_t)0)) {
×
1721
      fins->op1 = fleft->op1;
×
1722
      return RETRYFOLD;
×
1723
    }
1724
  }
1725
  return NEXTFOLD;
1726
#else
1727
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
1728
#endif
1729
}
1730

1731
/* -- Reassociation ------------------------------------------------------- */
1732

1733
LJFOLD(ADD ADD KINT)
1734
LJFOLD(MUL MUL KINT)
1735
LJFOLD(BAND BAND KINT)
1736
LJFOLD(BOR BOR KINT)
1737
LJFOLD(BXOR BXOR KINT)
1738
LJFOLDF(reassoc_intarith_k)
3,734✔
1739
{
1740
  IRIns *irk = IR(fleft->op2);
3,734✔
1741
  if (irk->o == IR_KINT) {
3,734✔
1742
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
3,710✔
1743
    if (k == irk->i)  /* (i o k1) o k2 ==> i o k1, if (k1 o k2) == k1. */
3,710✔
1744
      return LEFTFOLD;
3✔
1745
    PHIBARRIER(fleft);
3,707✔
1746
    fins->op1 = fleft->op1;
73✔
1747
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
73✔
1748
    return RETRYFOLD;  /* (i o k1) o k2 ==> i o (k1 o k2) */
73✔
1749
  }
1750
  return NEXTFOLD;
1751
}
1752

1753
LJFOLD(ADD ADD KINT64)
1754
LJFOLD(MUL MUL KINT64)
1755
LJFOLD(BAND BAND KINT64)
1756
LJFOLD(BOR BOR KINT64)
1757
LJFOLD(BXOR BXOR KINT64)
1758
LJFOLDF(reassoc_intarith_k64)
27,415✔
1759
{
1760
#if LJ_HASFFI
1761
  IRIns *irk = IR(fleft->op2);
27,415✔
1762
  if (irk->o == IR_KINT64) {
27,415✔
1763
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
9,058✔
1764
                                  (IROp)fins->o);
9,058✔
1765
    PHIBARRIER(fleft);
9,058✔
1766
    fins->op1 = fleft->op1;
9,049✔
1767
    fins->op2 = (IRRef1)lj_ir_kint64(J, k);
9,049✔
1768
    return RETRYFOLD;  /* (i o k1) o k2 ==> i o (k1 o k2) */
9,049✔
1769
  }
1770
  return NEXTFOLD;
1771
#else
1772
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
1773
#endif
1774
}
1775

1776
LJFOLD(BAND BAND any)
1777
LJFOLD(BOR BOR any)
1778
LJFOLDF(reassoc_dup)
1✔
1779
{
1780
  if (fins->op2 == fleft->op1 || fins->op2 == fleft->op2)
1✔
1781
    return LEFTFOLD;  /* (a o b) o a ==> a o b; (a o b) o b ==> a o b */
×
1782
  return NEXTFOLD;
1783
}
1784

1785
LJFOLD(MIN MIN any)
1786
LJFOLD(MAX MAX any)
1787
LJFOLDF(reassoc_dup_minmax)
8✔
1788
{
1789
  if (fins->op2 == fleft->op2)
8✔
1790
    return LEFTFOLD;  /* (a o b) o b ==> a o b */
2✔
1791
  return NEXTFOLD;
1792
}
1793

1794
LJFOLD(BXOR BXOR any)
1795
LJFOLDF(reassoc_bxor)
1✔
1796
{
1797
  PHIBARRIER(fleft);
1✔
1798
  if (fins->op2 == fleft->op1)  /* (a xor b) xor a ==> b */
1✔
1799
    return fleft->op2;
×
1800
  if (fins->op2 == fleft->op2)  /* (a xor b) xor b ==> a */
1✔
1801
    return fleft->op1;
×
1802
  return NEXTFOLD;
1803
}
1804

1805
LJFOLD(BSHL BSHL KINT)
1806
LJFOLD(BSHR BSHR KINT)
1807
LJFOLD(BSAR BSAR KINT)
1808
LJFOLD(BROL BROL KINT)
1809
LJFOLD(BROR BROR KINT)
1810
LJFOLDF(reassoc_shift)
2✔
1811
{
1812
  IRIns *irk = IR(fleft->op2);
2✔
1813
  PHIBARRIER(fleft);  /* The (shift any KINT) rule covers k2 == 0 and more. */
2✔
1814
  if (irk->o == IR_KINT) {  /* (i o k1) o k2 ==> i o (k1 + k2) */
1✔
1815
    int32_t mask = irt_is64(fins->t) ? 63 : 31;
1✔
1816
    int32_t k = (irk->i & mask) + (fright->i & mask);
1✔
1817
    if (k > mask) {  /* Combined shift too wide? */
1✔
1818
      if (fins->o == IR_BSHL || fins->o == IR_BSHR)
×
1819
        return mask == 31 ? INTFOLD(0) : INT64FOLD(0);
×
1820
      else if (fins->o == IR_BSAR)
×
1821
        k = mask;
1822
      else
1823
        k &= mask;
×
1824
    }
1825
    fins->op1 = fleft->op1;
1✔
1826
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
1✔
1827
    return RETRYFOLD;
1✔
1828
  }
1829
  return NEXTFOLD;
1830
}
1831

1832
LJFOLD(MIN MIN KINT)
1833
LJFOLD(MAX MAX KINT)
1834
LJFOLDF(reassoc_minmax_k)
×
1835
{
1836
  IRIns *irk = IR(fleft->op2);
×
1837
  if (irk->o == IR_KINT) {
×
1838
    int32_t a = irk->i;
×
1839
    int32_t y = kfold_intop(a, fright->i, fins->o);
×
1840
    if (a == y)  /* (x o k1) o k2 ==> x o k1, if (k1 o k2) == k1. */
×
1841
      return LEFTFOLD;
×
1842
    PHIBARRIER(fleft);
×
1843
    fins->op1 = fleft->op1;
×
1844
    fins->op2 = (IRRef1)lj_ir_kint(J, y);
×
1845
    return RETRYFOLD;  /* (x o k1) o k2 ==> x o (k1 o k2) */
×
1846
  }
1847
  return NEXTFOLD;
1848
}
1849

1850
/* -- Array bounds check elimination -------------------------------------- */
1851

1852
/* Eliminate ABC across PHIs to handle t[i-1] forwarding case.
1853
** ABC(asize, (i+k)+(-k)) ==> ABC(asize, i), but only if it already exists.
1854
** Could be generalized to (i+k1)+k2 ==> i+(k1+k2), but needs better disambig.
1855
*/
1856
LJFOLD(ABC any ADD)
1857
LJFOLDF(abc_fwd)
679✔
1858
{
1859
  if (LJ_LIKELY(J->flags & JIT_F_OPT_ABC)) {
679✔
1860
    if (irref_isk(fright->op2)) {
679✔
1861
      IRIns *add2 = IR(fright->op1);
678✔
1862
      if (add2->o == IR_ADD && irref_isk(add2->op2) &&
678✔
1863
          IR(fright->op2)->i == -IR(add2->op2)->i) {
32✔
1864
        IRRef ref = J->chain[IR_ABC];
6✔
1865
        IRRef lim = add2->op1;
6✔
1866
        if (fins->op1 > lim) lim = fins->op1;
6✔
1867
        while (ref > lim) {
15✔
1868
          IRIns *ir = IR(ref);
14✔
1869
          if (ir->op1 == fins->op1 && ir->op2 == add2->op1)
14✔
1870
            return DROPFOLD;
1871
          ref = ir->prev;
9✔
1872
        }
1873
      }
1874
    }
1875
  }
1876
  return NEXTFOLD;
1877
}
1878

1879
/* Eliminate ABC for constants.
1880
** ABC(asize, k1), ABC(asize k2) ==> ABC(asize, max(k1, k2))
1881
** Drop second ABC if k2 is lower. Otherwise patch first ABC with k2.
1882
*/
1883
LJFOLD(ABC any KINT)
1884
LJFOLDF(abc_k)
1,038✔
1885
{
1886
  PHIBARRIER(fleft);
1,038✔
1887
  if (LJ_LIKELY(J->flags & JIT_F_OPT_ABC)) {
1,035✔
1888
    IRRef ref = J->chain[IR_ABC];
1,035✔
1889
    IRRef asize = fins->op1;
1,035✔
1890
    while (ref > asize) {
1,165✔
1891
      IRIns *ir = IR(ref);
870✔
1892
      if (ir->op1 == asize && irref_isk(ir->op2)) {
870✔
1893
        uint32_t k = (uint32_t)IR(ir->op2)->i;
740✔
1894
        if ((uint32_t)fright->i > k)
740✔
1895
          ir->op2 = fins->op2;
40✔
1896
        return DROPFOLD;
740✔
1897
      }
1898
      ref = ir->prev;
130✔
1899
    }
1900
    return EMITFOLD;  /* Already performed CSE. */
295✔
1901
  }
1902
  return NEXTFOLD;
1903
}
1904

1905
/* Eliminate invariant ABC inside loop. */
1906
LJFOLD(ABC any any)
1907
LJFOLDF(abc_invar)
4,518✔
1908
{
1909
  /* Invariant ABC marked as P32 or U32. Drop if op1 is invariant too. */
1910
  if (!irt_isint(fins->t) && fins->op1 < J->chain[IR_LOOP] &&
4,518✔
1911
      (irt_isu32(fins->t) ||
8✔
1912
       (!irref_isk(fins->op1) && !irt_isphi(IR(fins->op1)->t))))
8✔
1913
    return DROPFOLD;
7✔
1914
  return NEXTFOLD;
1915
}
1916

1917
/* -- Commutativity ------------------------------------------------------- */
1918

1919
/* The refs of commutative ops are canonicalized. Lower refs go to the right.
1920
** Rationale behind this:
1921
** - It (also) moves constants to the right.
1922
** - It reduces the number of FOLD rules (e.g. (BOR any KINT) suffices).
1923
** - It helps CSE to find more matches.
1924
** - The assembler generates better code with constants at the right.
1925
*/
1926

1927
LJFOLD(ADD any any)
1928
LJFOLD(MUL any any)
1929
LJFOLD(ADDOV any any)
1930
LJFOLD(MULOV any any)
1931
LJFOLDF(comm_swap)
1,408,763✔
1932
{
1933
  if (fins->op1 < fins->op2) {  /* Move lower ref to the right. */
236,774✔
1934
    IRRef1 tmp = fins->op1;
78,551✔
1935
    fins->op1 = fins->op2;
78,551✔
1936
    fins->op2 = tmp;
78,551✔
1937
    return RETRYFOLD;
78,551✔
1938
  }
1939
  return NEXTFOLD;
1940
}
1941

1942
LJFOLD(EQ any any)
1943
LJFOLD(NE any any)
1944
LJFOLDF(comm_equal)
1,135,968✔
1945
{
1946
  /* For non-numbers only: x == x ==> drop; x ~= x ==> fail */
1947
  if (fins->op1 == fins->op2 &&
1,135,968✔
1948
      (!irt_isnum(fins->t) ||
296✔
1949
       (fleft->o == IR_CONV &&  /* Converted integers cannot be NaN. */
104✔
1950
        (uint32_t)(fleft->op2 & IRCONV_SRCMASK) - (uint32_t)IRT_I8 <= (uint32_t)(IRT_U64 - IRT_U8))))
11✔
1951
    return CONDFOLD(fins->o == IR_EQ);
200✔
1952
  return fold_comm_swap(J);
1,135,768✔
1953
}
1954

1955
LJFOLD(LT any any)
1956
LJFOLD(GE any any)
1957
LJFOLD(LE any any)
1958
LJFOLD(GT any any)
1959
LJFOLD(ULT any any)
1960
LJFOLD(UGE any any)
1961
LJFOLD(ULE any any)
1962
LJFOLD(UGT any any)
1963
LJFOLDF(comm_comp)
258,559✔
1964
{
1965
  /* For non-numbers only: x <=> x ==> drop; x <> x ==> fail */
1966
  if (fins->op1 == fins->op2 && !irt_isnum(fins->t))
258,559✔
1967
    return CONDFOLD((fins->o ^ (fins->o >> 1)) & 1);
×
1968
  if (fins->op1 < fins->op2) {  /* Move lower ref to the right. */
258,559✔
1969
    IRRef1 tmp = fins->op1;
84,683✔
1970
    fins->op1 = fins->op2;
84,683✔
1971
    fins->op2 = tmp;
84,683✔
1972
    fins->o ^= 3; /* GT <-> LT, GE <-> LE, does not affect U */
84,683✔
1973
    return RETRYFOLD;
84,683✔
1974
  }
1975
  return NEXTFOLD;
1976
}
1977

1978
LJFOLD(BAND any any)
1979
LJFOLD(BOR any any)
1980
LJFOLDF(comm_dup)
36,175✔
1981
{
1982
  if (fins->op1 == fins->op2)  /* x o x ==> x */
36,175✔
1983
    return LEFTFOLD;
×
1984
  return fold_comm_swap(J);
36,175✔
1985
}
1986

1987
LJFOLD(MIN any any)
1988
LJFOLD(MAX any any)
1989
LJFOLDF(comm_dup_minmax)
66,333✔
1990
{
1991
  if (fins->op1 == fins->op2)  /* x o x ==> x */
66,333✔
1992
    return LEFTFOLD;
×
1993
  return NEXTFOLD;
1994
}
1995

1996
LJFOLD(BXOR any any)
1997
LJFOLDF(comm_bxor)
48✔
1998
{
1999
  if (fins->op1 == fins->op2)  /* i xor i ==> 0 */
48✔
2000
    return irt_is64(fins->t) ? INT64FOLD(0) : INTFOLD(0);
2✔
2001
  return fold_comm_swap(J);
46✔
2002
}
2003

2004
/* -- Simplification of compound expressions ------------------------------ */
2005

2006
static TRef kfold_xload(jit_State *J, IRIns *ir, const void *p)
2007
{
2008
  int32_t k;
2009
  switch (irt_type(ir->t)) {
2010
  case IRT_NUM: return lj_ir_knum_u64(J, *(uint64_t *)p);
2011
  case IRT_I8: k = (int32_t)*(int8_t *)p; break;
2012
  case IRT_U8: k = (int32_t)*(uint8_t *)p; break;
2013
  case IRT_I16: k = (int32_t)(int16_t)lj_getu16(p); break;
2014
  case IRT_U16: k = (int32_t)(uint16_t)lj_getu16(p); break;
2015
  case IRT_INT: case IRT_U32: k = (int32_t)lj_getu32(p); break;
2016
  case IRT_I64: case IRT_U64: return lj_ir_kint64(J, *(uint64_t *)p);
2017
  default: return 0;
2018
  }
2019
  return lj_ir_kint(J, k);
2020
}
2021

2022
/* Turn: string.sub(str, a, b) == kstr
2023
** into: string.byte(str, a) == string.byte(kstr, 1) etc.
2024
** Note: this creates unaligned XLOADs on x86/x64.
2025
*/
2026
LJFOLD(EQ SNEW KGC)
2027
LJFOLD(NE SNEW KGC)
2028
LJFOLDF(merge_eqne_snew_kgc)
114✔
2029
{
2030
  GCstr *kstr = ir_kstr(fright);
114✔
2031
  int32_t len = (int32_t)kstr->len;
114✔
2032
  lj_assertJ(irt_isstr(fins->t), "bad equality IR type");
114✔
2033

2034
#if LJ_TARGET_UNALIGNED
2035
#define FOLD_SNEW_MAX_LEN        4  /* Handle string lengths 0, 1, 2, 3, 4. */
2036
#define FOLD_SNEW_TYPE8                IRT_I8        /* Creates shorter immediates. */
2037
#else
2038
#define FOLD_SNEW_MAX_LEN        1  /* Handle string lengths 0 or 1. */
2039
#define FOLD_SNEW_TYPE8                IRT_U8  /* Prefer unsigned loads. */
2040
#endif
2041

2042
  PHIBARRIER(fleft);
114✔
2043
  if (len <= FOLD_SNEW_MAX_LEN) {
110✔
2044
    IROp op = (IROp)fins->o;
110✔
2045
    IRRef strref = fleft->op1;
110✔
2046
    if (IR(strref)->o != IR_STRREF)
110✔
2047
      return NEXTFOLD;
2048
    if (op == IR_EQ) {
110✔
2049
      emitir(IRTGI(IR_EQ), fleft->op2, lj_ir_kint(J, len));
19✔
2050
      /* Caveat: fins/fleft/fright is no longer valid after emitir. */
2051
    } else {
2052
      /* NE is not expanded since this would need an OR of two conds. */
2053
      if (!irref_isk(fleft->op2))  /* Only handle the constant length case. */
91✔
2054
        return NEXTFOLD;
2055
      if (IR(fleft->op2)->i != len)
14✔
2056
        return DROPFOLD;
2057
    }
2058
    if (len > 0) {
26✔
2059
      /* A 4 byte load for length 3 is ok -- all strings have an extra NUL. */
2060
      uint16_t ot = (uint16_t)(len == 1 ? IRT(IR_XLOAD, FOLD_SNEW_TYPE8) :
25✔
2061
                               len == 2 ? IRT(IR_XLOAD, IRT_U16) :
2062
                               IRTI(IR_XLOAD));
2063
      TRef tmp = emitir(ot, strref,
47✔
2064
                        IRXLOAD_READONLY | (len > 1 ? IRXLOAD_UNALIGNED : 0));
2065
      TRef val = kfold_xload(J, IR(tref_ref(tmp)), strdata(kstr));
25✔
2066
      if (len == 3)
25✔
2067
        tmp = emitir(IRTI(IR_BAND), tmp,
1✔
2068
                     lj_ir_kint(J, LJ_ENDIAN_SELECT(0x00ffffff, 0xffffff00)));
2069
      fins->op1 = (IRRef1)tmp;
25✔
2070
      fins->op2 = (IRRef1)val;
25✔
2071
      fins->ot = (IROpT)IRTGI(op);
25✔
2072
      return RETRYFOLD;
25✔
2073
    } else {
2074
      return DROPFOLD;
2075
    }
2076
  }
2077
  return NEXTFOLD;
2078
}
2079

2080
/* -- Loads --------------------------------------------------------------- */
2081

2082
/* Loads cannot be folded or passed on to CSE in general.
2083
** Alias analysis is needed to check for forwarding opportunities.
2084
**
2085
** Caveat: *all* loads must be listed here or they end up at CSE!
2086
*/
2087

2088
LJFOLD(ALOAD any)
2089
LJFOLDX(lj_opt_fwd_aload)
2090

2091
/* From HREF fwd (see below). Must eliminate, not supported by fwd/backend. */
2092
LJFOLD(HLOAD KKPTR)
2093
LJFOLDF(kfold_hload_kkptr)
14✔
2094
{
2095
  UNUSED(J);
14✔
2096
  lj_assertJ(ir_kptr(fleft) == niltvg(J2G(J)), "expected niltv");
14✔
2097
  return TREF_NIL;
14✔
2098
}
2099

2100
LJFOLD(HLOAD any)
2101
LJFOLDX(lj_opt_fwd_hload)
2102

2103
LJFOLD(ULOAD any)
2104
LJFOLDX(lj_opt_fwd_uload)
2105

2106
LJFOLD(CALLL any IRCALL_lj_tab_len)
2107
LJFOLDX(lj_opt_fwd_tab_len)
2108

2109
/* Upvalue refs are really loads, but there are no corresponding stores.
2110
** So CSE is ok for them, except for UREFO across a GC step (see below).
2111
** If the referenced function is const, its upvalue addresses are const, too.
2112
** This can be used to improve CSE by looking for the same address,
2113
** even if the upvalues originate from a different function.
2114
*/
2115
LJFOLD(UREFO KGC any)
2116
LJFOLD(UREFC KGC any)
2117
LJFOLDF(cse_uref)
53,024✔
2118
{
2119
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
53,024✔
2120
    IRRef ref = J->chain[fins->o];
53,024✔
2121
    GCfunc *fn = ir_kfunc(fleft);
53,024✔
2122
    GCupval *uv = gco2uv(gcref(fn->l.uvptr[(fins->op2 >> 8)]));
53,024✔
2123
    while (ref > 0) {
91,239✔
2124
      IRIns *ir = IR(ref);
65,672✔
2125
      if (irref_isk(ir->op1)) {
65,672✔
2126
        GCfunc *fn2 = ir_kfunc(IR(ir->op1));
65,132✔
2127
        if (gco2uv(gcref(fn2->l.uvptr[(ir->op2 >> 8)])) == uv) {
65,132✔
2128
          if (fins->o == IR_UREFO && gcstep_barrier(J, ref))
27,457✔
2129
            break;
2130
          return ref;
2131
        }
2132
      }
2133
      ref = ir->prev;
38,215✔
2134
    }
2135
  }
2136
  return EMITFOLD;
25,582✔
2137
}
2138

2139
LJFOLD(HREFK any any)
2140
LJFOLDX(lj_opt_fwd_hrefk)
2141

2142
LJFOLD(HREF TNEW any)
2143
LJFOLDF(fwd_href_tnew)
31✔
2144
{
2145
  if (lj_opt_fwd_href_nokey(J))
31✔
2146
    return lj_ir_kkptr(J, niltvg(J2G(J)));
26✔
2147
  return NEXTFOLD;
2148
}
2149

2150
LJFOLD(HREF TDUP KPRI)
2151
LJFOLD(HREF TDUP KGC)
2152
LJFOLD(HREF TDUP KNUM)
2153
LJFOLDF(fwd_href_tdup)
157✔
2154
{
2155
  TValue keyv;
157✔
2156
  lj_ir_kvalue(J->L, &keyv, fright);
157✔
2157
  if (lj_tab_get(J->L, ir_ktab(IR(fleft->op1)), &keyv) == niltvg(J2G(J)) &&
314✔
2158
      lj_opt_fwd_href_nokey(J))
157✔
2159
    return lj_ir_kkptr(J, niltvg(J2G(J)));
97✔
2160
  return NEXTFOLD;
2161
}
2162

2163
/* We can safely FOLD/CSE array/hash refs and field loads, since there
2164
** are no corresponding stores. But we need to check for any NEWREF with
2165
** an aliased table, as it may invalidate all of the pointers and fields.
2166
** Only HREF needs the NEWREF check -- AREF and HREFK already depend on
2167
** FLOADs. And NEWREF itself is treated like a store (see below).
2168
** LREF is constant (per trace) since coroutine switches are not inlined.
2169
*/
2170
LJFOLD(FLOAD TNEW IRFL_TAB_ASIZE)
2171
LJFOLDF(fload_tab_tnew_asize)
960✔
2172
{
2173
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
960✔
2174
    return INTFOLD(fleft->op1);
333✔
2175
  return NEXTFOLD;
2176
}
2177

2178
LJFOLD(FLOAD TNEW IRFL_TAB_HMASK)
2179
LJFOLDF(fload_tab_tnew_hmask)
109✔
2180
{
2181
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
109✔
2182
    return INTFOLD((1 << fleft->op2)-1);
86✔
2183
  return NEXTFOLD;
2184
}
2185

2186
LJFOLD(FLOAD TDUP IRFL_TAB_ASIZE)
2187
LJFOLDF(fload_tab_tdup_asize)
71✔
2188
{
2189
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
71✔
2190
    return INTFOLD((int32_t)ir_ktab(IR(fleft->op1))->asize);
54✔
2191
  return NEXTFOLD;
2192
}
2193

2194
LJFOLD(FLOAD TDUP IRFL_TAB_HMASK)
2195
LJFOLDF(fload_tab_tdup_hmask)
1,225✔
2196
{
2197
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
1,225✔
2198
    return INTFOLD((int32_t)ir_ktab(IR(fleft->op1))->hmask);
998✔
2199
  return NEXTFOLD;
2200
}
2201

2202
LJFOLD(HREF any any)
2203
LJFOLD(FLOAD any IRFL_TAB_ARRAY)
2204
LJFOLD(FLOAD any IRFL_TAB_NODE)
2205
LJFOLD(FLOAD any IRFL_TAB_ASIZE)
2206
LJFOLD(FLOAD any IRFL_TAB_HMASK)
2207
LJFOLDF(fload_tab_ah)
1,825,292✔
2208
{
2209
  TRef tr = lj_opt_cse(J);
1,825,292✔
2210
  return lj_opt_fwd_tptr(J, tref_ref(tr)) ? tr : EMITFOLD;
1,825,292✔
2211
}
2212

2213
/* Strings are immutable, so we can safely FOLD/CSE the related FLOAD. */
2214
LJFOLD(FLOAD KGC IRFL_STR_LEN)
2215
LJFOLDF(fload_str_len_kgc)
124✔
2216
{
2217
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
124✔
2218
    return INTFOLD((int32_t)ir_kstr(fleft)->len);
124✔
2219
  return NEXTFOLD;
2220
}
2221

2222
LJFOLD(FLOAD SNEW IRFL_STR_LEN)
2223
LJFOLDF(fload_str_len_snew)
3✔
2224
{
2225
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
3✔
2226
    PHIBARRIER(fleft);
3✔
2227
    return fleft->op2;
3✔
2228
  }
2229
  return NEXTFOLD;
2230
}
2231

2232
LJFOLD(FLOAD TOSTR IRFL_STR_LEN)
2233
LJFOLDF(fload_str_len_tostr)
×
2234
{
2235
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && fleft->op2 == IRTOSTR_CHAR)
×
2236
    return INTFOLD(1);
×
2237
  return NEXTFOLD;
2238
}
2239

2240
/* The C type ID of cdata objects is immutable. */
2241
LJFOLD(FLOAD KGC IRFL_CDATA_CTYPEID)
2242
LJFOLDF(fload_cdata_typeid_kgc)
7,569✔
2243
{
2244
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
7,569✔
2245
    return INTFOLD((int32_t)ir_kcdata(fleft)->ctypeid);
7,569✔
2246
  return NEXTFOLD;
2247
}
2248

2249
/* Get the contents of immutable cdata objects. */
2250
LJFOLD(FLOAD KGC IRFL_CDATA_PTR)
2251
LJFOLD(FLOAD KGC IRFL_CDATA_INT)
2252
LJFOLD(FLOAD KGC IRFL_CDATA_INT64)
2253
LJFOLDF(fload_cdata_int64_kgc)
7,517✔
2254
{
2255
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
7,517✔
2256
    void *p = cdataptr(ir_kcdata(fleft));
7,517✔
2257
    if (irt_is64(fins->t))
7,517✔
2258
      return INT64FOLD(*(uint64_t *)p);
7,483✔
2259
    else
2260
      return INTFOLD(*(int32_t *)p);
34✔
2261
  }
2262
  return NEXTFOLD;
2263
}
2264

2265
LJFOLD(FLOAD CNEW IRFL_CDATA_CTYPEID)
2266
LJFOLD(FLOAD CNEWI IRFL_CDATA_CTYPEID)
2267
LJFOLDF(fload_cdata_typeid_cnew)
50,318✔
2268
{
2269
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
50,318✔
2270
    return fleft->op1;  /* No PHI barrier needed. CNEW/CNEWI op1 is const. */
50,318✔
2271
  return NEXTFOLD;
2272
}
2273

2274
/* Pointer, int and int64 cdata objects are immutable. */
2275
LJFOLD(FLOAD CNEWI IRFL_CDATA_PTR)
2276
LJFOLD(FLOAD CNEWI IRFL_CDATA_INT)
2277
LJFOLD(FLOAD CNEWI IRFL_CDATA_INT64)
2278
LJFOLDF(fload_cdata_ptr_int64_cnew)
50,117✔
2279
{
2280
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
50,117✔
2281
    return fleft->op2;  /* Fold even across PHI to avoid allocations. */
50,117✔
2282
  return NEXTFOLD;
2283
}
2284

2285
LJFOLD(FLOAD any IRFL_STR_LEN)
2286
LJFOLD(FLOAD any IRFL_FUNC_ENV)
2287
LJFOLD(FLOAD any IRFL_THREAD_ENV)
2288
LJFOLD(FLOAD any IRFL_CDATA_CTYPEID)
2289
LJFOLD(FLOAD any IRFL_CDATA_PTR)
2290
LJFOLD(FLOAD any IRFL_CDATA_INT)
2291
LJFOLD(FLOAD any IRFL_CDATA_INT64)
2292
LJFOLD(VLOAD any any)  /* Vararg loads have no corresponding stores. */
2293
LJFOLDX(lj_opt_cse)
2294

2295
/* All other field loads need alias analysis. */
2296
LJFOLD(FLOAD any any)
2297
LJFOLDX(lj_opt_fwd_fload)
2298

2299
/* This is for LOOP only. Recording handles SLOADs internally. */
2300
LJFOLD(SLOAD any any)
2301
LJFOLDF(fwd_sload)
9,668✔
2302
{
2303
  if ((fins->op2 & IRSLOAD_FRAME)) {
9,668✔
2304
    TRef tr = lj_opt_cse(J);
42✔
2305
    return tref_ref(tr) < J->chain[IR_RETF] ? EMITFOLD : tr;
42✔
2306
  } else {
2307
    lj_assertJ(J->slot[fins->op1] != 0, "uninitialized slot accessed");
9,626✔
2308
    return J->slot[fins->op1];
9,626✔
2309
  }
2310
}
2311

2312
/* Only fold for KKPTR. The pointer _and_ the contents must be const. */
2313
LJFOLD(XLOAD KKPTR any)
2314
LJFOLDF(xload_kptr)
46✔
2315
{
2316
  TRef tr = kfold_xload(J, fins, ir_kptr(fleft));
46✔
2317
  return tr ? tr : NEXTFOLD;
46✔
2318
}
2319

2320
LJFOLD(XLOAD any any)
2321
LJFOLDX(lj_opt_fwd_xload)
2322

2323
/* -- Frame handling ------------------------------------------------------ */
2324

2325
/* Prevent CSE of a REF_BASE operand across IR_RETF. */
2326
LJFOLD(SUB any BASE)
2327
LJFOLD(SUB BASE any)
2328
LJFOLD(EQ any BASE)
2329
LJFOLDF(fold_base)
620✔
2330
{
2331
  return lj_opt_cselim(J, J->chain[IR_RETF]);
620✔
2332
}
2333

2334
/* -- Write barriers ------------------------------------------------------ */
2335

2336
/* Write barriers are amenable to CSE, but not across any incremental
2337
** GC steps.
2338
**
2339
** The same logic applies to open upvalue references, because a stack
2340
** may be resized during a GC step (not the current stack, but maybe that
2341
** of a coroutine).
2342
*/
2343
LJFOLD(TBAR any)
2344
LJFOLD(OBAR any any)
2345
LJFOLD(UREFO any any)
2346
LJFOLDF(barrier_tab)
142,992✔
2347
{
2348
  TRef tr = lj_opt_cse(J);
142,992✔
2349
  if (gcstep_barrier(J, tref_ref(tr)))  /* CSE across GC step? */
142,992✔
2350
    return EMITFOLD;  /* Raw emit. Assumes fins is left intact by CSE. */
66✔
2351
  return tr;
2352
}
2353

2354
LJFOLD(TBAR TNEW)
2355
LJFOLD(TBAR TDUP)
2356
LJFOLDF(barrier_tnew_tdup)
937✔
2357
{
2358
  /* New tables are always white and never need a barrier. */
2359
  if (fins->op1 < J->chain[IR_LOOP])  /* Except across a GC step. */
937✔
2360
    return NEXTFOLD;
×
2361
  return DROPFOLD;
2362
}
2363

2364
/* -- Profiling ----------------------------------------------------------- */
2365

2366
LJFOLD(PROF any any)
2367
LJFOLDF(prof)
1✔
2368
{
2369
  IRRef ref = J->chain[IR_PROF];
1✔
2370
  if (ref+1 == J->cur.nins)  /* Drop neighbouring IR_PROF. */
1✔
2371
    return ref;
2372
  return EMITFOLD;
1✔
2373
}
2374

2375
/* -- Stores and allocations ---------------------------------------------- */
2376

2377
/* Stores and allocations cannot be folded or passed on to CSE in general.
2378
** But some stores can be eliminated with dead-store elimination (DSE).
2379
**
2380
** Caveat: *all* stores and allocs must be listed here or they end up at CSE!
2381
*/
2382

2383
LJFOLD(ASTORE any any)
2384
LJFOLD(HSTORE any any)
2385
LJFOLDX(lj_opt_dse_ahstore)
2386

2387
LJFOLD(USTORE any any)
2388
LJFOLDX(lj_opt_dse_ustore)
2389

2390
LJFOLD(FSTORE any any)
2391
LJFOLDX(lj_opt_dse_fstore)
2392

2393
LJFOLD(XSTORE any any)
2394
LJFOLDX(lj_opt_dse_xstore)
2395

2396
LJFOLD(NEWREF any any)  /* Treated like a store. */
2397
LJFOLD(CALLA any any)
2398
LJFOLD(CALLL any any)  /* Safeguard fallback. */
2399
LJFOLD(CALLS any any)
2400
LJFOLD(CALLXS any any)
2401
LJFOLD(XBAR)
2402
LJFOLD(RETF any any)  /* Modifies BASE. */
2403
LJFOLD(TNEW any any)
2404
LJFOLD(TDUP any)
2405
LJFOLD(CNEW any any)
2406
LJFOLD(XSNEW any any)
2407
LJFOLD(BUFHDR any any)
2408
LJFOLDX(lj_ir_emit)
2409

2410
/* -- Miscellaneous ------------------------------------------------------- */
2411

2412
LJFOLD(CARG any any)
2413
LJFOLDF(cse_carg)
1,666✔
2414
{
2415
  TRef tr = lj_opt_cse(J);
1,666✔
2416
  if (tref_ref(tr) < J->chain[IR_LOOP])  /* CSE across loop? */
1,666✔
2417
    return EMITFOLD;  /* Raw emit. Assumes fins is left intact by CSE. */
1✔
2418
  return tr;
2419
}
2420

2421
/* ------------------------------------------------------------------------ */
2422

2423
/* Every entry in the generated hash table is a 32 bit pattern:
2424
**
2425
** xxxxxxxx iiiiiii lllllll rrrrrrrrrr
2426
**
2427
**   xxxxxxxx = 8 bit index into fold function table
2428
**    iiiiiii = 7 bit folded instruction opcode
2429
**    lllllll = 7 bit left instruction opcode
2430
** rrrrrrrrrr = 8 bit right instruction opcode or 10 bits from literal field
2431
*/
2432

2433
#include "lj_folddef.h"
2434

2435
/* ------------------------------------------------------------------------ */
2436

2437
/* Fold IR instruction. */
2438
TRef LJ_FASTCALL lj_opt_fold(jit_State *J)
8,081,886✔
2439
{
2440
  uint32_t key, any;
8,081,886✔
2441
  IRRef ref;
8,081,886✔
2442

2443
  if (LJ_UNLIKELY((J->flags & JIT_F_OPT_MASK) != JIT_F_OPT_DEFAULT)) {
8,081,886✔
2444
    lj_assertJ(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
623,235✔
2445
                JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT,
2446
               "bad JIT_F_OPT_DEFAULT");
2447
    /* Folding disabled? Chain to CSE, but not for loads/stores/allocs. */
2448
    if (!(J->flags & JIT_F_OPT_FOLD) && irm_kind(lj_ir_mode[fins->o]) == IRM_N)
623,235✔
2449
      return lj_opt_cse(J);
270,188✔
2450

2451
    /* No FOLD, forwarding or CSE? Emit raw IR for loads, except for SLOAD. */
2452
    if ((J->flags & (JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE)) !=
353,047✔
2453
                    (JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE) &&
351,397✔
2454
        irm_kind(lj_ir_mode[fins->o]) == IRM_L && fins->o != IR_SLOAD)
351,397✔
2455
      return lj_ir_emit(J);
260,153✔
2456

2457
    /* No FOLD or DSE? Emit raw IR for stores. */
2458
    if ((J->flags & (JIT_F_OPT_FOLD|JIT_F_OPT_DSE)) !=
92,894✔
2459
                    (JIT_F_OPT_FOLD|JIT_F_OPT_DSE) &&
91,088✔
2460
        irm_kind(lj_ir_mode[fins->o]) == IRM_S)
91,088✔
2461
      return lj_ir_emit(J);
90,805✔
2462
  }
2463

2464
  /* Fold engine start/retry point. */
2465
retry:
7,460,740✔
2466
  /* Construct key from opcode and operand opcodes (unless literal/none). */
2467
  key = ((uint32_t)fins->o << 17);
7,639,361✔
2468
  if (fins->op1 >= J->cur.nk) {
7,639,361✔
2469
    key += (uint32_t)IR(fins->op1)->o << 10;
7,625,604✔
2470
    *fleft = *IR(fins->op1);
7,625,604✔
2471
    if (fins->op1 < REF_TRUE)
7,625,604✔
2472
      fleft[1] = IR(fins->op1)[1];
1,445,864✔
2473
  }
2474
  if (fins->op2 >= J->cur.nk) {
7,639,361✔
2475
    key += (uint32_t)IR(fins->op2)->o;
4,085,594✔
2476
    *fright = *IR(fins->op2);
4,085,594✔
2477
    if (fins->op2 < REF_TRUE)
4,085,594✔
2478
      fright[1] = IR(fins->op2)[1];
3,472,605✔
2479
  } else {
2480
    key += (fins->op2 & 0x3ffu);  /* Literal mask. Must include IRCONV_*MASK. */
3,553,767✔
2481
  }
2482

2483
  /* Check for a match in order from most specific to least specific. */
2484
  any = 0;
7,639,361✔
2485
  for (;;) {
33,693,969✔
2486
    uint32_t k = key | (any & 0x1ffff);
20,666,665✔
2487
    uint32_t h = fold_hashkey(k);
20,666,665✔
2488
    uint32_t fh = fold_hash[h];  /* Lookup key in semi-perfect hash table. */
20,666,665✔
2489
    if ((fh & 0xffffff) == k || (fh = fold_hash[h+1], (fh & 0xffffff) == k)) {
20,666,665✔
2490
      ref = (IRRef)tref_ref(fold_func[fh >> 24](J));
7,697,212✔
2491
      if (ref != NEXTFOLD)
7,697,211✔
2492
        break;
2493
    }
2494
    if (any == 0xfffff)  /* Exhausted folding. Pass on to CSE. */
14,760,717✔
2495
      return lj_opt_cse(J);
1,733,413✔
2496
    any = (any | (any >> 10)) ^ 0xffc00;
13,027,304✔
2497
  }
2498

2499
  /* Return value processing, ordered by frequency. */
2500
  if (LJ_LIKELY(ref >= MAX_FOLD))
5,905,947✔
2501
    return TREF(ref, irt_t(IR(ref)->t));
4,927,379✔
2502
  if (ref == RETRYFOLD)
978,568✔
2503
    goto retry;
178,621✔
2504
  if (ref == KINTFOLD)
799,947✔
2505
    return lj_ir_kint(J, fins->i);
24,389✔
2506
  if (ref == FAILFOLD)
775,558✔
2507
    lj_trace_err(J, LJ_TRERR_GFAIL);
6✔
2508
  lj_assertJ(ref == DROPFOLD, "bad fold result");
2509
  return REF_DROP;
2510
}
2511

2512
/* -- Common-Subexpression Elimination ------------------------------------ */
2513

2514
/* CSE an IR instruction. This is very fast due to the skip-list chains. */
2515
TRef LJ_FASTCALL lj_opt_cse(jit_State *J)
5,107,703✔
2516
{
2517
  /* Avoid narrow to wide store-to-load forwarding stall */
2518
  IRRef2 op12 = (IRRef2)fins->op1 + ((IRRef2)fins->op2 << 16);
5,107,703✔
2519
  IROp op = fins->o;
5,107,703✔
2520
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
5,107,703✔
2521
    /* Limited search for same operands in per-opcode chain. */
2522
    IRRef ref = J->chain[op];
4,838,649✔
2523
    IRRef lim = fins->op1;
4,838,649✔
2524
    if (fins->op2 > lim) lim = fins->op2;  /* Relies on lit < REF_BIAS. */
4,838,649✔
2525
    while (ref > lim) {
18,095,575✔
2526
      if (IR(ref)->op12 == op12)
16,109,997✔
2527
        return TREF(ref, irt_t(IR(ref)->t));  /* Common subexpression found. */
2,853,071✔
2528
      ref = IR(ref)->prev;
13,256,926✔
2529
    }
2530
  }
2531
  /* Otherwise emit IR (inlined for speed). */
2532
  {
2533
    IRRef ref = lj_ir_nextins(J);
2,254,632✔
2534
    IRIns *ir = IR(ref);
2,254,631✔
2535
    ir->prev = J->chain[op];
2,254,631✔
2536
    ir->op12 = op12;
2,254,631✔
2537
    J->chain[op] = (IRRef1)ref;
2,254,631✔
2538
    ir->o = fins->o;
2,254,631✔
2539
    J->guardemit.irt |= fins->t.irt;
2,254,631✔
2540
    return TREF(ref, irt_t((ir->t = fins->t)));
2,254,631✔
2541
  }
2542
}
2543

2544
/* CSE with explicit search limit. */
2545
TRef LJ_FASTCALL lj_opt_cselim(jit_State *J, IRRef lim)
176,344✔
2546
{
2547
  IRRef ref = J->chain[fins->o];
176,344✔
2548
  IRRef2 op12 = (IRRef2)fins->op1 + ((IRRef2)fins->op2 << 16);
176,344✔
2549
  while (ref > lim) {
699,641✔
2550
    if (IR(ref)->op12 == op12)
613,640✔
2551
      return ref;
90,343✔
2552
    ref = IR(ref)->prev;
523,297✔
2553
  }
2554
  return lj_ir_emit(J);
86,001✔
2555
}
2556

2557
/* ------------------------------------------------------------------------ */
2558

2559
#undef IR
2560
#undef fins
2561
#undef fleft
2562
#undef fright
2563
#undef knumleft
2564
#undef knumright
2565
#undef emitir
2566

2567
#endif
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc