Stitching it all together #621

@markshannon

We are currently building the region selector, the optimizer, and the components that execute the optimized regions, but we lack an architecture to link the regions together and give us good performance.

The general idea is that once we are executing in tier 2, we should try to stay in tier 2.

So exits should be able to adapt as they become hot.

We solve this problem with the fundamental theorem of software engineering: any problem can be solved by adding another level of indirection.

We want something that looks like an executor, but that takes a pointer to the pointer to the executor.

Something like:

typedef struct _PyAdaptiveExecutorObject {
    PyObject_HEAD
    struct _PyInterpreterFrame *(*execute)(struct _PyAdaptiveExecutorObject **ptr, struct _PyInterpreterFrame *frame, PyObject **stack_pointer);
    /* Data needed by the executor goes here */
} _PyAdaptiveExecutorObject;

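Note the additional indirection: execute receives the address of the slot that holds the executor pointer, not the executor itself. This moves the memory load from the caller to the callee, making the call a bit slower, but it allows the callee executor to change which executor the caller calls next time.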
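
To make the indirection concrete, here is a minimal, self-contained sketch in plain C. It is a toy model, not CPython code: ToyExecutor, fast_execute, slow_execute, call_through and demo_slot_is_repointed are hypothetical names, and an int stands in for the frame and stack pointer. It only shows a callee repointing the slot it was called through.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an executor. All names in this block are hypothetical. */
typedef struct ToyExecutor {
    /* Receives the address of the slot that was used to reach this executor. */
    int (*execute)(struct ToyExecutor **slot, int input);
    /* Executor to install in the caller's slot, or NULL to leave it alone. */
    struct ToyExecutor *replacement;
} ToyExecutor;

/* A "fast" executor: does its work and leaves the slot untouched. */
int fast_execute(ToyExecutor **slot, int input)
{
    (void)slot;
    return input * 2;
}

/* A "slow" executor: computes the same result, then repoints the caller's
 * slot so the next call through that slot reaches the replacement. */
int slow_execute(ToyExecutor **slot, int input)
{
    ToyExecutor *self = *slot;  /* the load that the callee now performs */
    if (self->replacement != NULL) {
        *slot = self->replacement;
    }
    return input + input;
}

/* The caller dispatches through the slot and passes the slot's address on. */
int call_through(ToyExecutor **slot, int input)
{
    return (*slot)->execute(slot, input);
}

/* Returns 1 if the slot points at the fast executor after one slow call. */
int demo_slot_is_repointed(void)
{
    ToyExecutor fast = { fast_execute, NULL };
    ToyExecutor slow = { slow_execute, &fast };
    ToyExecutor *slot = &slow;

    int first = call_through(&slot, 21);   /* runs slow_execute, repoints slot */
    int second = call_through(&slot, 21);  /* runs fast_execute */
    assert(first == second);               /* same result either way */
    return slot == &fast;
}
```

The caller's code never changes; only the pointer stored in its slot does, which is what lets an exit start out cold and later be replaced by something faster.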

On exiting from tier 2 execution, instead of returning to the tier 1 interpreter, we should jump to an adaptive stub which checks for hotness and dispatches accordingly.

For a side exit that has additional micro-ops to run before exiting, the code for the side-exit stub would look like:

_PyInterpreterFrame *
cold_execute(_PyAdaptiveExecutorObject **ptr, _PyInterpreterFrame *frame, PyObject **stack_pointer)
{
    _PyAdaptiveExecutorObject *self = *ptr;
    if (is_hot(self->counter)) {
        /* Hot: compile the exit's uops and patch the caller's slot. */
        _PyAdaptiveExecutorObject *executor = compile(self->uops);
        *ptr = executor;
        return executor->execute(ptr, frame, stack_pointer);
    }
    else {
        /* Still cold: interpret the exit's uops. */
        return uop_interpret(self->uops, frame, stack_pointer);
    }
}
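
The stub above leaves is_hot and the counter unspecified. One possible shape, purely as an assumption, is a small saturating counter that is bumped each time the exit is taken and compared against a threshold. ExitCounter, HOT_THRESHOLD, exit_counter_tick and ticks_until_hot below are hypothetical names, and the threshold value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold; the issue does not specify a value. */
#define HOT_THRESHOLD 16

/* Hypothetical per-exit counter. */
typedef struct ExitCounter {
    uint16_t value;
} ExitCounter;

/* Record one more pass through the exit and report whether it is now hot.
 * The counter saturates at HOT_THRESHOLD, so it cannot wrap around. */
int exit_counter_tick(ExitCounter *counter)
{
    if (counter->value < HOT_THRESHOLD) {
        counter->value++;
    }
    return counter->value >= HOT_THRESHOLD;
}

/* Count how many passes a fresh exit needs before it is reported hot. */
int ticks_until_hot(void)
{
    ExitCounter counter = { 0 };
    int ticks = 1;
    while (!exit_counter_tick(&counter)) {
        ticks++;
    }
    assert(counter.value == HOT_THRESHOLD);
    return ticks;
}
```

With this scheme a fresh exit is interpreted for the first HOT_THRESHOLD - 1 passes and becomes eligible for compilation on the next one. Other policies, such as counters that decay or back off, would fit the same stub.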

For an end exit (or a side exit with no attached uops) the code looks like this:

_PyInterpreterFrame *
end_execute(_PyAdaptiveExecutorObject **ptr, _PyInterpreterFrame *frame, PyObject **stack_pointer)
{
    _PyAdaptiveExecutorObject *self = *ptr;
    if (!is_hot(self->counter)) {
        /* Not hot yet: resume in tier 1. */
        return frame;
    }
    /* Declared here so that it is still in scope after the if/else. */
    _PyAdaptiveExecutorObject *executor = NULL;
    if (frame->next_instr->op.code == ENTER_EXECUTOR) {
        /* The target already has an executor: link to it. */
        executor = frame->f_code->co_executors[frame->next_instr->op.arg];
    }
    else {
        /* No executor at the target yet: ask the optimizer for one. */
        _PyOptimizerObject *opt = _PyInterpreterState_Get()->optimizer;
        int err = opt->optimize(opt, frame->f_code, frame->next_instr, &executor, stack_pointer);
        if (err < 0) {
            return NULL;
        }
    }
    *ptr = executor;
    return executor->execute(ptr, frame, stack_pointer);
}

Labels: epic-tier2-optimizer (Linear code region optimizer for 3.13 and beyond.)
