The architecture of JPyInterpreter is composed of several components spread across Python and Java. The Python components are:
jvm_setup.py, which sets up the JVM and the hooks JPyInterpreter uses to communicate with CPython (to look up packages, call native methods, and convert CPython objects to Java objects and vice versa).
python_to_java_bytecode_translator.py, which acts as the Python frontend to JPyInterpreter. Users supply a CPython function to translate (and an optional Java functional interface it implements) to translate_python_bytecode_to_java_bytecode, which first converts that function to a PythonCompiledFunction and then passes it to PythonBytecodeToJavaBytecodeTranslator to translate the function. It also provides the translate_python_class_to_java_class function, which is given a user-supplied CPython class, converts it to a PythonCompiledClass, and passes it to PythonClassTranslator to translate the class.
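To make the first conversion step more concrete, here is a minimal sketch of the kind of information a frontend can read from a CPython function using only the standard library. The CompiledFunctionSketch dataclass and the snapshot_function helper are hypothetical names invented for this illustration; they are not the real PythonCompiledFunction, whose fields are not listed here.

```python
import dis
from dataclasses import dataclass


@dataclass
class CompiledFunctionSketch:
    """Hypothetical stand-in for PythonCompiledFunction (not the real class)."""
    qualified_name: str
    instructions: list      # (offset, opname, argument) triples
    constants: tuple
    names: tuple
    variable_names: tuple


def snapshot_function(function):
    """Copy the parts of a CPython function that a translator would need."""
    code = function.__code__
    instructions = [
        (instruction.offset, instruction.opname, instruction.arg)
        for instruction in dis.get_instructions(function)
    ]
    return CompiledFunctionSketch(
        qualified_name=function.__qualname__,
        instructions=instructions,
        constants=code.co_consts,
        names=code.co_names,
        variable_names=code.co_varnames,
    )


def add(a, b):
    return a + b


compiled = snapshot_function(add)
# The exact opcode names printed depend on the CPython version in use.
for offset, opname, argument in compiled.instructions:
    print(offset, opname, argument)
```

Because the snapshot holds only plain data, it could be handed across a language boundary without keeping a reference to the live CPython function.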
The Java components are:
PythonBytecodeToJavaBytecodeTranslator, which acts as the entrypoint for function translation. It is responsible for:
Setting up the MethodVisitor used for bytecode generation.
Creating and setting fields on the generated class objects (See PythonBytecodeToJavaBytecodeTranslator for details).
Doing the legwork of moving/translating Java parameters (ex:
Setting up cell variables.
PythonGeneratorTranslator when it detects the function being translated is a generator.
FlowGraph to calculate the StackMetadata of each instruction.
implement method on every opcode in the Opcode list with the StackMetadata for the opcode and the FunctionMetadata for the overall function.
PythonBytecodeToJavaBytecodeTranslator but for generators. It breaks a single generator function into multiple advance functions, and generates each advance function's bytecode independently.
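The idea of splitting a generator into several advance functions can be illustrated in plain Python. The sketch below is a hand-written state machine that behaves like a two-step generator; the class name, the _advance_N method names, and the state field are assumptions made for this example and do not describe the bytecode the translator actually emits.

```python
def countdown(start):
    """An ordinary generator, kept for comparison."""
    yield start
    yield start - 1


class CountdownStateMachine:
    """Hand-written equivalent with one advance function per resume point."""

    def __init__(self, start):
        self.start = start
        self.state = 0  # index of the advance function to run next

    def _advance_0(self):
        self.state = 1
        return self.start          # value of the first yield

    def _advance_1(self):
        self.state = 2
        return self.start - 1      # value of the second yield

    def _advance_2(self):
        raise StopIteration        # the generator body has finished

    def __iter__(self):
        return self

    def __next__(self):
        advance_functions = (self._advance_0, self._advance_1, self._advance_2)
        return advance_functions[self.state]()


print(list(countdown(3)))
print(list(CountdownStateMachine(3)))
```

Each advance function covers the code between two resume points, so each one can be generated without knowing the body of the others; only the shared state has to be agreed on.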
FlowGraph, which calculates the StackMetadata that corresponds to each Opcode. It is responsible for unifying the StackMetadata from all jump sources for each Opcode that is a jump target. For instance, if two sources with the same target have post
FlowGraph will unify that to
bool is a subclass of int, for better or worse).
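One way to picture the unification of types arriving from several jump sources is a search for the nearest common ancestor. The following sketch is an assumption about the general technique, not FlowGraph's actual algorithm; unify_types and unify_stacks are hypothetical helpers, and the int/bool example is chosen only because bool is a subclass of int in Python.

```python
from functools import reduce


def unify_types(first, second):
    """Return the most specific class in first's MRO that second also derives from."""
    for candidate in first.__mro__:
        if issubclass(second, candidate):
            return candidate
    return object  # unreachable for normal classes, kept as a safe default


def unify_stacks(stacks):
    """Unify the stack types arriving at one jump target, slot by slot.

    Assumes every incoming stack has the same depth.
    """
    return [reduce(unify_types, slot_types) for slot_types in zip(*stacks)]


print(unify_types(bool, int))
print(unify_types(int, str))
print(unify_stacks([[int, str], [bool, str]]))
```

With multiple inheritance, the result of this simple search can depend on argument order, so a production implementation would likely need a more careful rule.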
StackMetadata stores metadata about the stack and local variables. Each Opcode gets its own StackMetadata instance. It is mostly used to perform optimizations; for instance, if we detect the top two items on the stack are
BINARY_ADD instruction, we can change the (normally complex due to Python semantics) BINARY_ADD bytecode into a single method call.
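The following sketch shows why known operand types help. generic_add is a simplified outline of Python's addition protocol (it deliberately omits the rule that gives priority to a right operand whose type is a subclass of the left operand's type), and select_add_implementation is a hypothetical helper that picks a direct call when both types are known to be int. The choice of int as the specialized type is an assumption for illustration.

```python
def generic_add(left, right):
    """Simplified outline of Python's `+` protocol."""
    left_type, right_type = type(left), type(right)
    result = NotImplemented
    add = getattr(left_type, "__add__", None)
    if add is not None:
        result = add(left, right)
    if result is NotImplemented and right_type is not left_type:
        reflected_add = getattr(right_type, "__radd__", None)
        if reflected_add is not None:
            result = reflected_add(right, left)
    if result is NotImplemented:
        raise TypeError(
            f"unsupported operand type(s) for +: "
            f"{left_type.__name__!r} and {right_type.__name__!r}"
        )
    return result


def select_add_implementation(left_type, right_type):
    """Pick a direct call when both operand types are known, else the generic path."""
    if left_type is int and right_type is int:
        return int.__add__  # a single call, no protocol lookups needed
    return generic_add


print(select_add_implementation(int, int)(2, 3))
print(select_add_implementation(int, float)(1, 2.5))
```

The specialized path skips every lookup and fallback in generic_add, which is the kind of saving that per-instruction type metadata makes possible.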
FunctionMetadata stores metadata about the function (for instance, the MethodVisitor to use to generate bytecode). Each Opcode gets the same
Opcodes are the interface between CPython opcodes and the Implementors. Each describes a particular operation, and usually (but not always) corresponds to a CPython opcode. Some CPython opcodes map to the same
Implementors are responsible for generating the Java bytecode corresponding to CPython bytecode. They can be found in the
The overall process of compiling a function looks like this:
The builtin types for JPyInterpreter can be found in the
They all implement
PythonLikeObject, the interface the bytecode uses to represent arbitrary objects.
If type flow analysis determines a more specific type can be used (via
StackMetadata), the more specific type is used directly instead.
PythonLikeObject has several methods:
__getAttributeOrNull: returns the attribute with the given name if it exists, otherwise returns null. This is NOT __getattribute__ (which is implemented by $method$__getattribute__ instead). This is more akin to self.__dict__[attribute]. The default $method$__getattribute__ uses it to get the attribute (with additional magic to handle descriptors; see the Python descriptor tutorial for more detail).
__getAttributeOrError: returns the attribute with the given name if it exists, otherwise raises AttributeError. Used in bytecode generation to look up methods on types.
__setAttribute: sets the attribute with the given name to the given value. This is NOT __setattr__ (which is implemented by $method$__setattr__ instead). This is more akin to self.__dict__[attribute] = value. The default $method$__setattr__ uses it to set the attribute.
__deleteAttribute: deletes the attribute with the given name. This is NOT __delattr__ (which is implemented by $method$__delattr__ instead). This is more akin to del self.__dict__[attribute]. The default $method$__delattr__ uses it to delete the attribute.
__getType: returns the type of the object. Used to implement
__getGenericType: returns the generic type of the object (ex: list[int]). Used for type flow analysis.
$method$<method-name>: the builtin methods on every object in Python. The $method$<method-name> naming is to allow custom classes to override them (custom classes prefix method names with $method$ to not clash with Java method names).
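The distinction between a raw dictionary lookup and the full __getattribute__ protocol can be seen in plain Python. In this sketch, get_attribute_or_none is a hypothetical Python analogue of __getAttributeOrNull (using None in place of null); it is an illustration of the stated analogy to self.__dict__[attribute], not the Java implementation.

```python
def get_attribute_or_none(instance, name):
    """Raw lookup in the instance dictionary: no descriptors, no class lookup."""
    return vars(instance).get(name)


class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        # A property is a descriptor stored on the class, not on the instance.
        return self.celsius * 9 / 5 + 32


reading = Temperature(100)
print(get_attribute_or_none(reading, "celsius"))     # stored on the instance
print(get_attribute_or_none(reading, "fahrenheit"))  # not in the instance dict
print(reading.fahrenheit)                            # full __getattribute__ lookup
```

The property is only found by the full lookup, which is the "additional magic" that the default __getattribute__ layers on top of the raw lookup.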
The entrypoint for function translation, and the glue code for the many subsystems of the translator.
It is responsible for setting up the MethodVisitor and configuring the class's fields.
The fields it configures are:
co_consts: Static; a List<PythonLikeObject> that stores constants used in the bytecode.
co_names: Static; a List<PythonString> that stores names used in the bytecode.
co_varnames: Static; a List<PythonString> that stores variable names used in the bytecode.
__globals__: Static; a Map<String, PythonLikeObject> used to read and store globals.
__spec_getter__: Static; a BiFunction<PythonLikeTuple, PythonLikeDict, ArgumentSpec<PythonLikeObject>> that maps default arguments (which are per function) to an ArgumentSpec that can be used to set parameters.
__defaults__: Instance; a PythonLikeTuple that stores default positional arguments.
__kwdefaults__: Instance; a PythonLikeDict that stores default keyword arguments.
__annotations__: Instance; a PythonLikeDict that stores type annotations on the function.
__closure__: Instance; a PythonLikeTuple that stores the function’s closure (i.e. the free variable cells).
__qualname__: Instance; a PythonString that stores the qualified name of the function.
__spec__: Instance; an ArgumentSpec that can be used to receive parameters (and correctly handle default arguments).
__interpreter__: Instance; the PythonInterpreter this function runs in (used to perform imports and look up unknown globals).
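Most of the field names above mirror attributes that CPython itself exposes on functions and code objects. The sketch below inspects those CPython attributes on a small closure; make_scaler and scale are example functions written for this illustration, and the snippet shows only the CPython side, not the generated Java class.

```python
def make_scaler(factor):
    def scale(value: float, offset: float = 0.0, *, clamp: bool = False) -> float:
        result = value * factor + offset
        return max(result, 0.0) if clamp else result
    return scale


scale = make_scaler(2.0)
code = scale.__code__

# Per code object (shared by every function created from the same def):
print(code.co_consts)      # constants used in the bytecode
print(code.co_names)       # global and attribute names used in the bytecode
print(code.co_varnames)    # parameter and local variable names

# Per function object (can differ between closures of the same code):
print(scale.__defaults__)                  # default positional arguments
print(scale.__kwdefaults__)                # default keyword-only arguments
print(scale.__annotations__)               # type annotations
print(scale.__closure__[0].cell_contents)  # value held by the free variable cell
print(scale.__qualname__)                  # qualified name
```

The split between code-object attributes and function-object attributes lines up with the Static and Instance labels in the list: data tied to the compiled body is shared, while defaults, closure cells, and annotations belong to each function instance.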
If a Python function cannot be translated for any reason (ex: native code), the following fields are also added:
__code__: Static; an opaque pointer to the function’s CPython code object (used in the constructor to make the wrapped CPython function).
__function__: Instance; a PythonObjectWrapper that wraps the CPython function (used to call the CPython function).