Python Function Structure

In Python, the details about a function, including its implementation and argument spec, are accessible from the function object. In jpyinterpreter, this information is stored in the class PythonCompiledFunction. It has several attributes. In particular, we are interested in:

  • __globals__: The globals dict that is used by the function. All classes defined in the same file share the same globals. Classes defined in different files have different globals. All globals variables are stored in this dict.

  • __closure__: Variables used by the function that were declared in an outer scope. for instance:

    def outer():
        x = 10
        def inner():
            return x
        return inner

    the function inner.__closure__ would be [Cell(value=10)]. Cells are used because changes in the outer function affects the inner function. For instance:

    def outer():
        x = 10
        def inner():
            return x
        x = 20
        return inner()  # returns 20
  • __defaults__: contains default values for "allow-positional" arguments. For example, for the function:

    def my_function(a, pos_only=1, / , allow_pos=2, * , keyword_only=3):
        pass

    __defaults__ would be (1, 2). Keyword-only default arguments are instead stored in __kwdefaults__, which for the above function would be {'keyword_only': 3}.

  • __qualname__: the qualified name of the function.

  • __annotations__: the type annotations of the function. For example, for the function:

    def typed_function(a: int, b: str) -> 'A':
        return A(a, b)

    __annotations__ would be {'a': int, 'b': str, 'return': 'A'}. Values in the __annotations__ dict should be a type or a string. If it is a string, it represents the type that has the same name as that string.

  • __code__: The code object of the function, containing the function bytecode and argument spec. Code objects are described below.

Code Objects

Code objects is the byte-compiled Python function. The primary difference between function objects and code objects is that function objects contain context (i.e. globals, defaults, closure) whereas code objects are context-free. Of its attributes, we are interested in:

  • co_names: A tuple of names used by the bytecode. For instance, the function

    def my_function():
        a = 1
        b = 2
        return max(a, b)

    has ('a', 'b', 'max') in its co_names attribute. It is referenced by the bytecode to load and store globals. For example, LOAD_GLOBAL 2 means "load the global with the name given by the entry in co_names at index 2" (i.e. max).

  • co_varnames: a tuple of variable names used by the bytecode (including parameters). For instance, the function:

    def outer_function(outer_parameter):
        def inner_function(inner_parameter):
            my_local = outer_parameter + inner_parameter
            return my_local * 2
        return inner_function

    inner_function’s co_varnames would be ('inner_parameter', 'my_local', 'outer_parameters'). co_varnames always starts with the function’s parameter, followed by local variables used in the function, followed by cell variables used in the function. Cell variables are variables that are stored in a cell. There are two types: free variables, which is a cell from an outer function (in this case, outer_parameter is a free variable of inner_function), and bound variables, which is a cell that is used in an inner function (in this case, outer_parameter is a bound variable of outer_function).

  • co_cellvars: a tuple of strings representing bound variable.

  • co_freevars: a tuple of strings representing free variable.

  • co_constants: a tuple of constants used in the function. For instance, the function:

    def my_function():
        a = 1
        b = 2
        return a + b

    co_constants would be (1, 2). In CPython, objects eligible to be constants are:

    • int objects (ex: 1, 42)

    • float objects (ex: 1.5, NaN)

    • bool objects (True and False)

    • str objects (ex: 'hello world')

    • bytes objects (ex: b’hello world')

    • None (None)

    • Ellipsis (…​)

      The following objects are not eligible to be used as constants, and are loaded with LOAD_GLOBAL instead:

    • class objects (ex: int, str, user defined classes)

    • function objects (ex: max, range, user defined functions)

    • Instances of user defined classes

  • co_exceptiontable: a mapping from bytecode instruction ranges to their exception handler. The exception handler have the following properties:

    • targetInstruction: the bytecode index to jump to

    • stackDepth: the stack depth prior to the try block (required for restoring the stack state)

    • pushLastIndex: A boolean, that if true, indicates the handler should push the bytecode index that raised the exception prior to pushing the exception (otherwise, only the exception is pushed).

      This attribute is only present in Python 3.11 and above; previous versions of Python use bytecode instruction to push and pop exception blocks. For example, the function:

      def my_function():
          for x in range(10):
              try:
                  y = other_function_1(x)
                  try:
                      other_function_2(x, y)
                  except:
                      print('other 2 exception')
              except:
                  print('other 1 exception')

      would have (simplified) bytecode looking like this:

      0  LOAD_GLOBAL 0 (range)
      1  LOAD_CONSTANT 1 (10)
      2  CALL 1
      3  GET_ITER
      4  FOR_ITER 19 (23 LOAD_CONSTANT)
      5  STORE_LOCAL 0 (x)
      6  LOAD_GLOBAL 1 (other_function_1)
      7  LOAD_LOCAL 0 (x)
      8  CALL 1
      9  STORE_LOCAL 1 (y)
      10 LOAD_GLOBAL 2 (other_function_2)
      11 LOAD_LOCAL 0 (x)
      12 LOAD_LOCAL 1 (y)
      13 CALL 2
      14 JUMP_BACKWARDS 10 (4 FOR_ITER)
      15 POP_TOP
      16 LOAD_GLOBAL 3 (print)
      17 LOAD_CONSTANT 2 ('other 2 exception')
      18 JUMP_BACKWARDS 14 (4 FOR_ITER)
      19 POP_TOP
      20 LOAD_GLOBAL 3 (print)
      21 LOAD_CONSTANT 2 ('other 1 exception')
      22 JUMP_BACKWARDS 18 (4 FOR_ITER)
      23 LOAD_CONSTANT 0 (None)
      24 RETURN

      For simplicity, the code that handles exceptions in exception handlers has been excluded. The above bytecode would have the corresponding exception table below:

      Exception Table:
      [12, 14) -> 15 (stack-depth: 1)
      [5,  18) -> 19 (stack-depth: 1)

      Stack depth is 1 because the iterator was on the stack prior to the try block.

  • co_argcount the number of allow-positional arguments the function takes. For example, for the function below:

    def my_function(a, b, /, c, d=10, *, e=20):
        pass

    co_argcount would be 4, since a, b, c, and d can be specified as a positional argument (a, b are required positional only arguments, c is a positional-or-keyword required argument, d is a positional-or-keyword optional argument, e is a keyword-only optional argument).

  • co_kwonlyargcount is the number of keyword-only arguments. For the example given in co_argcount, it would be 1, since e is the only keyword-only argument.

  • co_posonlyargcount is the number of positional-only arguments. For the example given in co_argcount, it would be 2, since a and b are the only positional-only arguments.