Working with Code Objects

The Code type is the foundational abstraction in codetransformer. It provides high-level APIs for working with logically-grouped sets of instructions and for converting to and from CPython’s native code type.

Constructing Code Objects

The most common way constructing a Code object is to use the from_pycode() classmethod, which accepts a CPython code object.

There are two common ways of building raw code objects:

  • CPython functions have a __code__ attribute, which contains the bytecode executed by the function.
  • The compile() builtin can compile a string of Python source code into a code object.

Using from_pycode(), we can build a Code object and inspect its contents:

>>> from codetransformer import Code
>>> def add2(x):
...     return x + 2
...
>>> co = Code.from_pycode(add.__code__)
>>> co.instrs
(LOAD_FAST('x'), LOAD_CONST(2), BINARY_ADD, RETURN_VALUE)
>>> co.argnames
('x',)
>>> c.consts
(2,)

We can convert our Code object back into its raw form via the to_bytecode() method:

>>> co.to_pycode()
<code object add2 at 0x7f6ba05f2030, file "<stdin>", line 1>

Building Transformers

Once we have the ability to convert to and from an abstract code representation, we gain the ability to perform transformations on that abtract representation.

Let’s say that we want to replace the addition operation in our add2 function with a multiplication. We could try to mutate our Code object directly before converting back to Python bytecode, but there are many subtle invariants [1] between the instructions and the other pieces of metadata that must be maintained to ensure that the generated output can be executed correctly.

Rather than encourage users to mutate Code objects in place, codetransformer provides the CodeTransformer class, which allows users to declaratively describe operations to perform on sequences of instructions.

Implemented as a CodeTransformer, our “replace additions with multiplications” operation looks like this:

from codetransformer import CodeTransformer, pattern
from codetransformer.instructions import BINARY_ADD, BINARY_MULTIPLY


class add2mul(CodeTransformer):
    @pattern(BINARY_ADD)
    def _add2mul(self, add_instr):
        yield BINARY_MULTIPLY().steal(add_instr)

The important piece here is the _add2mul method, which has been decorated with a pattern. Patterns provide an API for describing sequences of instructions to match against for replacement and/or modification. The CodeTransformer base class looks at methods with registered patterns and compares them against the instructions of the Code object under transformation. For each matching sequence of instructions, the decorated method is called with all matching instructions *-unpacked into the method. The method’s job is to take the input instructions and return an iterable of new instructions to serve as replacements. It is often convenient to implement transformer methods as generator functions, as we’ve done here.

In this example, we’ve supplied the simplest possible pattern: a single instruction type to match. [2] Our transformer method will be called on every BINARY_ADD instruction in the target code object, and it will yield a BINARY_MULTIPLY as replacement each time.

Applying Transformers

To apply a CodeTransformer to a function, we construct an instance of the transformer and call it on the function we want to modify. The result is a new function whose instructions have been rewritten applying our transformer’s methods to matched sequences of the input function’s instructions. The original function is not mutated in place.

Example:

>>> transformer = add2mul()
>>> mul2 = transformer(add2) # mult2 is a brand-new function
>>> mul2(5)
10

When we don’t care about having access to the pre-transformed version of a function, it’s convenient and idiomatic to apply transformers as decorators:

>>> @add2mul()
... def mul2(x):
...     return x + 2
...
>>> mul2(5)
10
[1]For example, if we add a new constant, we have to ensure that we correctly maintain the indices of existing constants in the generated code’s co_consts, and if we replace an instruction that was the target of a jump, we have to make sure that the jump instruction resolves correctly to our new instruction.
[2]Many more complex patterns are possible. See the docs for codetransformer.patterns.pattern for more examples.