JPyInterpreter Architecture
The architecture of JPyInterpreter is composed of several components spread across Python and Java. The Python components are:
- jvm_setup.py, which sets up the JVM and the hooks JPyInterpreter uses to communicate with CPython (to look up packages, call native methods, and convert CPython objects to Java objects and vice versa).
- python_to_java_bytecode_translator.py, which acts as the Python frontend to JPyInterpreter. Users supply a CPython function to translate (and an optional Java functional interface it implements) to translate_python_bytecode_to_java_bytecode, which first converts that function to a PythonCompiledFunction and then passes it to PythonBytecodeToJavaBytecodeTranslator to translate the function. It also provides the translate_python_class_to_java_class function, which is given a user-supplied CPython class, converts it to a PythonCompiledClass, and passes it to PythonClassTranslator to translate the class.
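To make the frontend's input more concrete, the sketch below uses only the standard dis module to show the kind of data a CPython function exposes before translation. The helper name summarize_function is hypothetical and is not part of JPyInterpreter; the fields of the real PythonCompiledFunction may differ.

```python
import dis


def summarize_function(func):
    """Collect code-object data a bytecode frontend could use (hypothetical helper)."""
    code = func.__code__
    return {
        "co_consts": code.co_consts,
        "co_names": code.co_names,
        "co_varnames": code.co_varnames,
        # (opname, argument) pairs; the exact opcodes vary between CPython versions
        "instructions": [(ins.opname, ins.argval) for ins in dis.get_instructions(func)],
    }


def add_one(x):
    return x + 1


summary = summarize_function(add_one)
print(summary["co_varnames"])  # ('x',)
for opname, argval in summary["instructions"]:
    print(opname, argval)
```

The instruction list printed by this sketch depends on the CPython version, which is one reason a translator needs a layer between CPython opcodes and its own operations.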
The Java components are:
- PythonBytecodeToJavaBytecodeTranslator, which acts as the entrypoint for function translation. It is responsible for:
  - Setting up the JavaPythonClassWriter and MethodVisitor used for bytecode generation.
  - Creating and setting fields on the generated class objects (see PythonBytecodeToJavaBytecodeTranslator for details).
  - Doing the legwork of moving/translating Java parameters (ex: int) into PythonLikeObject.
  - Setting up cell variables.
  - Delegating to PythonGeneratorTranslator when it detects the function being translated is a generator.
  - Using FlowGraph to calculate the StackMetadata of each instruction.
  - Calling the implement method on every opcode in the Opcode list with the StackMetadata for the opcode and the FunctionMetadata for the overall function.
- PythonGeneratorTranslator, which is like PythonBytecodeToJavaBytecodeTranslator but for generators. It breaks a single generator function into multiple advance functions and generates the bytecode of each advance function independently.
- FlowGraph, which calculates the StackMetadata that corresponds to each Opcode. It is responsible for unifying the StackMetadata from all jump sources for each Opcode that is a jump target. For instance, if two sources with the same target have post StackMetadata of … int and … bool respectively, FlowGraph will unify that to … int (since bool is a subclass of int, for better or worse).
- StackMetadata, which stores metadata about the stack and local variables. Each Opcode gets its own StackMetadata instance. It is mostly used to perform optimizations; for instance, if we detect the top two items on the stack are int and int for the BINARY_ADD instruction, we can change the (normally complex due to Python semantics) BINARY_ADD bytecode into a single method call.
- FunctionMetadata, which stores metadata about the function (for instance, the MethodVisitor to use to generate bytecode). Each Opcode gets the same FunctionMetadata instance.
- Opcode, the interface between CPython opcodes and the Implementors. Each Opcode describes a particular operation and usually (but not always) corresponds to a CPython opcode. Some CPython opcodes map to the same Opcode implementation.
- Implementors, which are responsible for generating the Java bytecode corresponding to CPython bytecode. They can be found in the implementors package.
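The unification FlowGraph performs at jump targets can be illustrated with a small Python sketch. It assumes unification means "most specific common ancestor" found by walking the method resolution order; the function names unify_types and unify_stacks are hypothetical, and the real Java implementation may use different rules.

```python
def unify_types(left, right):
    """Return the first class in left's MRO that right also inherits from."""
    for candidate in left.__mro__:
        if issubclass(right, candidate):
            return candidate
    return object  # not reached in practice: every class has object in its MRO


def unify_stacks(first, second):
    """Unify two stacks (lists of types) that arrive at the same jump target."""
    if len(first) != len(second):
        raise ValueError("stack depths differ at jump target")
    return [unify_types(a, b) for a, b in zip(first, second)]


print(unify_types(int, bool))                   # <class 'int'>
print(unify_stacks([str, int], [str, bool]))    # [<class 'str'>, <class 'int'>]
print(unify_types(str, int))                    # <class 'object'>
```

When two sources disagree completely, as with str and int, the sketch falls back to object, which corresponds to losing the ability to specialize the opcode at that target.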
The overall process of compiling a function looks like this:
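As a rough, simplified model of that process for straight-line code (no jumps, generators, or cells), the Python sketch below computes a StackMetadata per opcode and then calls implement on each opcode with the shared FunctionMetadata. The opcode classes LoadConst and BinaryAdd, the next_stack method, the argument order of implement, and the emitted strings are all assumptions made for illustration; the real translator is written in Java and emits JVM bytecode through a MethodVisitor.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionMetadata:
    """Shared by every opcode; stands in for the real per-function state."""
    emitted: list = field(default_factory=list)


@dataclass
class StackMetadata:
    """Types known to be on the stack before an opcode runs (top is last)."""
    stack: tuple = ()


class LoadConst:
    def __init__(self, value):
        self.value = value

    def next_stack(self, meta):
        return StackMetadata(meta.stack + (type(self.value),))

    def implement(self, function_meta, stack_meta):
        function_meta.emitted.append(f"push {self.value!r}")


class BinaryAdd:
    def next_stack(self, meta):
        left, right = meta.stack[-2:]
        result = int if (left, right) == (int, int) else object
        return StackMetadata(meta.stack[:-2] + (result,))

    def implement(self, function_meta, stack_meta):
        # Specialize when both operands are known to be int; otherwise stay generic.
        if stack_meta.stack[-2:] == (int, int):
            function_meta.emitted.append("call int.add")
        else:
            function_meta.emitted.append("generic binary add")


def translate(opcodes):
    function_meta = FunctionMetadata()
    stack_meta = StackMetadata()
    per_opcode = []
    for opcode in opcodes:  # flow analysis: record the stack seen by each opcode
        per_opcode.append(stack_meta)
        stack_meta = opcode.next_stack(stack_meta)
    for opcode, meta in zip(opcodes, per_opcode):  # code generation
        opcode.implement(function_meta, meta)
    return function_meta.emitted


print(translate([LoadConst(1), LoadConst(2), BinaryAdd()]))
# ['push 1', 'push 2', 'call int.add']
print(translate([LoadConst(1), LoadConst("a"), BinaryAdd()]))
# ['push 1', "push 'a'", 'generic binary add']
```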

Types
The builtin types for JPyInterpreter can be found in the types package.
They all implement PythonLikeObject, the interface the bytecode uses to represent arbitrary objects.
If type flow analysis determines a more specific type can be used (via StackMetadata), the more specific type is used directly instead.
PythonLikeObject has several methods:
- __getAttributeOrNull: returns the attribute with the given name if it exists, otherwise returns null. This is NOT __getattribute__ (which is implemented by $method$__getattribute__ instead). This is more akin to self.__dict__[attribute]. The default $method$__getattribute__ uses it to get the attribute (with additional magic to handle descriptors; see the Python descriptor tutorial for more detail).
- __getAttributeOrError: returns the attribute with the given name if it exists, otherwise raises AttributeError. Used in bytecode generation to look up methods on types.
- __setAttribute: sets the attribute with the given name to the given value. This is NOT __setattr__ (which is implemented by $method$__setattr__ instead). This is more akin to self.__dict__[attribute] = value. The default $method$__setattr__ uses it to set the attribute.
- __deleteAttribute: deletes the attribute with the given name. This is NOT __delattr__ (which is implemented by $method$__delattr__ instead). This is more akin to del self.__dict__[attribute]. The default $method$__delattr__ uses it to delete the attribute.
- __getType: returns the type of the object. Used to implement type(object).
- __getGenericType: returns the generic type of the object (ex: list[int]). Used for type flow analysis.
- $method$<method-name>: the builtin methods on every object in Python. The $method$<method-name> naming allows custom classes to override them (custom classes prefix method names with $method$ so they do not clash with Java method names).
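The difference between the raw attribute accessors and the $method$ variants can be modelled in plain Python. The sketch below is a toy stand-in, not the Java interface: the class SketchObject, its method names, and the type_attrs dictionary are hypothetical, and the descriptor handling is a simplified version of CPython's lookup order.

```python
class SketchObject:
    """Toy model of an object with raw and descriptor-aware attribute access."""

    def __init__(self, type_attrs=None):
        self.attrs = {}                     # plays the role of self.__dict__
        self.type_attrs = type_attrs or {}  # attributes defined on the type

    def get_attribute_or_null(self, name):
        # Raw lookup: no descriptors, no type lookup, no exception.
        # As with a null return, a stored None is indistinguishable from "missing".
        return self.attrs.get(name)

    def set_attribute(self, name, value):
        self.attrs[name] = value

    def delete_attribute(self, name):
        del self.attrs[name]

    def getattribute(self, name):
        # Simplified default __getattribute__: data descriptors on the type win,
        # then the instance's own attributes, then other type attributes.
        type_value = self.type_attrs.get(name)
        if hasattr(type_value, "__set__") or hasattr(type_value, "__delete__"):
            return type_value.__get__(self, type(self))
        own = self.get_attribute_or_null(name)
        if own is not None:
            return own
        if hasattr(type_value, "__get__"):
            return type_value.__get__(self, type(self))
        if type_value is not None:
            return type_value
        raise AttributeError(name)


obj = SketchObject({"answer": property(lambda self: 42), "greet": lambda self: "hi"})
obj.set_attribute("answer", 1)
print(obj.get_attribute_or_null("answer"))  # 1
print(obj.getattribute("answer"))           # 42
print(obj.getattribute("greet")())          # hi
```

The property on the type shadows the raw value stored on the instance, which is why the raw accessor and the descriptor-aware lookup can return different results for the same name.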
PythonBytecodeToJavaBytecodeTranslator
The entrypoint for function translation, and the glue code for the many subsystems of the translator.
It is responsible for setting up the JavaPythonClassWriter and MethodVisitor, and for configuring the class's fields.
The fields it configures are:
- co_consts: Static; a List<PythonLikeObject> that stores constants used in the bytecode.
- co_names: Static; a List<PythonString> that stores names used in the bytecode.
- co_varnames: Static; a List<PythonString> that stores variable names used in the bytecode.
- __globals__: Static; a Map<String, PythonLikeObject> used to read and store globals.
- __spec_getter__: Static; a BiFunction<PythonLikeTuple, PythonLikeDict, ArgumentSpec<PythonLikeObject>> that maps default arguments (which are per function) to an ArgumentSpec that can be used to set parameters.
- __defaults__: Instance; a PythonLikeTuple that stores default positional arguments.
- __kwdefaults__: Instance; a PythonLikeDict that stores default keyword arguments.
- __annotations__: Instance; a PythonLikeDict that stores type annotations on the function.
- __closure__: Instance; a PythonLikeTuple that stores the function’s closure (i.e. the free variable cells).
- __qualname__: Instance; a PythonString that stores the qualified name of the function.
- __spec__: Instance; an ArgumentSpec that can be used to receive parameters (and correctly handle default arguments).
- __interpreter__: Instance; the PythonInterpreter this function runs in (used to perform imports and look up unknown globals).
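Several of these fields mirror attributes that CPython itself keeps on function objects. The following snippet, which uses only standard CPython attributes, shows what those values look like for a small closure; the functions make_counter and counter are made up for the example, and how JPyInterpreter copies the values into the generated class is not shown.

```python
def make_counter(start: int = 0, *, step: int = 1):
    def counter(offset=10):
        return start + step + offset
    return counter


counter = make_counter()

print(make_counter.__defaults__)     # (0,)
print(make_counter.__kwdefaults__)   # {'step': 1}
print(make_counter.__annotations__)  # {'start': <class 'int'>, 'step': <class 'int'>}
print(counter.__qualname__)          # make_counter.<locals>.counter
print(counter.__code__.co_freevars)  # ('start', 'step')
print([cell.cell_contents for cell in counter.__closure__])  # [0, 1]
print(counter())                     # 11
```

Defaults and closures belong to each function object rather than to its code object, which matches the Static versus Instance split in the list above.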
If a Python function cannot be translated for any reason (ex: native code), the following fields are also added:
- __code__: Static; an opaque pointer to the function’s CPython code object (used in the constructor to create the wrapped CPython function).
- __function__: Instance; a PythonObjectWrapper that wraps the CPython function (used to call the CPython function).
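One situation where translation is impossible is a callable implemented in native code, which carries no Python bytecode at all. The check below is only an illustration of how that case can be recognized from the Python side; the helper has_python_bytecode is hypothetical, and the conditions JPyInterpreter actually uses to fall back to a wrapped CPython function are not described here.

```python
import types


def has_python_bytecode(func):
    """Return True if func carries a CPython code object that could be translated."""
    return isinstance(getattr(func, "__code__", None), types.CodeType)


def pure_python(x):
    return x * 2


print(has_python_bytecode(pure_python))  # True
print(has_python_bytecode(len))          # False (builtin implemented in C)
```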