
Avoid boolean objects when branching in the JIT #149238

@markshannon


We currently specialize TO_BOOL for many common types.
This avoids the overhead of API calls, but we still need to load either True or False, and then test the result against True or False.

Having to load and compare with Py_True and Py_False is expensive relative to what are often quite simple operations. E.g. _TO_BOOL_LIST is 10 instructions (AArch64 Linux), but only half of that is performing the comparison.

We can break down _TO_BOOL_FOO into _TO_BOOL_BIT_FOO; _BIT_TO_BOOL
and then optimize _BIT_TO_BOOL; _GUARD_IS_TRUE_POP to _GUARD_IS_TRUE_BIT_POP.

Here the "bit" versions produce a single-bit boolean (0 for False, 1 for True).
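The two rewrites above can be sketched as a small peephole pass over a list of uop names. This is only an illustrative model in Python, not the optimizer's actual code: the functions `split_to_bool` and `fuse_bit_guards` are hypothetical names, traces are modelled as plain lists of strings, and treating every `_TO_BOOL_<TYPE>` name uniformly is an assumption.

```python
# Illustrative model only: uops are plain strings, and the function names
# below are hypothetical, not part of CPython.

TO_BOOL_PREFIX = "_TO_BOOL_"
TO_BOOL_BIT_PREFIX = "_TO_BOOL_BIT_"


def split_to_bool(trace):
    """Rewrite each _TO_BOOL_FOO into _TO_BOOL_BIT_FOO; _BIT_TO_BOOL."""
    out = []
    for uop in trace:
        is_specialized = uop.startswith(TO_BOOL_PREFIX)
        already_bit = uop.startswith(TO_BOOL_BIT_PREFIX)
        if is_specialized and not already_bit:
            suffix = uop[len(TO_BOOL_PREFIX):]
            out.append(TO_BOOL_BIT_PREFIX + suffix)
            out.append("_BIT_TO_BOOL")
        else:
            out.append(uop)
    return out


def fuse_bit_guards(trace):
    """Rewrite _BIT_TO_BOOL; _GUARD_IS_TRUE_POP into _GUARD_IS_TRUE_BIT_POP."""
    out = []
    i = 0
    while i < len(trace):
        followed_by_guard = (
            i + 1 < len(trace) and trace[i + 1] == "_GUARD_IS_TRUE_POP"
        )
        if trace[i] == "_BIT_TO_BOOL" and followed_by_guard:
            # The boolean object is never observed, so skip materializing it.
            out.append("_GUARD_IS_TRUE_BIT_POP")
            i += 2
        else:
            out.append(trace[i])
            i += 1
    return out


branching = ["_LOAD_FAST", "_TO_BOOL_LIST", "_GUARD_IS_TRUE_POP"]
storing = ["_LOAD_FAST", "_TO_BOOL_LIST", "_STORE_FAST"]

print(fuse_bit_guards(split_to_bool(branching)))
print(fuse_bit_guards(split_to_bool(storing)))
```

In this model, the branching trace ends up with no boolean object at all, while the trace that stores the result keeps `_BIT_TO_BOOL` because the object is still needed there.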

Whereas _TO_BOOL_LIST is 10 instructions, a hypothetical _TO_BOOL_BIT_LIST would only be 5 instructions.

We already optimize _GUARD_IS_TRUE_POP to _GUARD_BIT_IS_SET_POP, reducing the number of machine instructions from 5 to 2, but replacing it with _GUARD_IS_TRUE_BIT_POP would reduce it to a single machine instruction and remove the need for the replication in _GUARD_BIT_IS_SET_POP.
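To make the intended equivalence concrete, here is a rough Python model of the object-based path and the bit-based path for a list operand. It only models the semantics, not the machine code or the instruction counts. The function names mirror the uop names but are hypothetical, and each guard is modelled as returning whether execution would stay on the trace.

```python
# Semantic model only: real uops operate on stack references in C.

def to_bool_list(value):
    """Model of _TO_BOOL_LIST: materialize a bool object from the list length."""
    return True if len(value) else False


def guard_is_true_pop(flag):
    """Model of _GUARD_IS_TRUE_POP: identity test against the True singleton."""
    return flag is True


def to_bool_bit_list(value):
    """Model of the proposed _TO_BOOL_BIT_LIST: produce 0 or 1, no object."""
    return 1 if len(value) else 0


def guard_is_true_bit_pop(bit):
    """Model of the proposed _GUARD_IS_TRUE_BIT_POP: test the bit directly."""
    return bit == 1


def branch_via_object(value):
    return guard_is_true_pop(to_bool_list(value))


def branch_via_bit(value):
    return guard_is_true_bit_pop(to_bool_bit_list(value))


# Both paths must take the same branch for every input.
for sample in ([], [0], [1, 2, 3]):
    assert branch_via_object(sample) == branch_via_bit(sample)
```

The point of the model is that the intermediate True or False object carries no information beyond the bit, so a guard that consumes the bit directly decides the branch the same way.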

We can also replace many of the comparisons with a "bit" form, e.g. replacing _COMPARE_OP_FLOAT with _COMPARE_OP_BIT_FLOAT would reduce the code size from 19 to 13 instructions (21 to 14 accounting for the following guard as well).
[ Specializing for the actual operation can further reduce the stencil size to 8 instructions. ]


All instruction counts are for the variant with all inputs and outputs in registers.

Metadata


    Labels

    interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), topic-JIT
