We are currently building the region selector, the optimizer, and the components to execute the optimized regions, but we lack an architecture to link the regions together and give us good performance.
The general idea is that once we are executing in tier 2, we should try to stay in tier 2.
So exits should be able to adapt as they become hot.
We solve this problem with the Fundamental theorem of software engineering: add another level of indirection.
We want something that looks like an executor, but that takes a pointer to the pointer to the executor.
Something like:
typedef struct _PyAdaptiveExecutorObject {
    PyObject_HEAD
    /* Takes the address of the caller's executor pointer so the callee can replace it. */
    struct _PyInterpreterFrame *(*execute)(
        struct _PyAdaptiveExecutorObject **ptr,
        struct _PyInterpreterFrame *frame,
        PyObject **stack_pointer);
    /* Data needed by the executor goes here */
} _PyAdaptiveExecutorObject;
Note the additional indirection. This moves the memory load from the caller to the callee, making it a bit slower, but allows the callee executor to modify which executor the caller calls next time.
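The effect of the extra indirection can be illustrated with a small standalone model. This is only a sketch: `toy_executor`, `toy_cold_execute`, `toy_hot_execute`, `toy_call`, and `TOY_HOT_THRESHOLD` are hypothetical names invented for the illustration, and the "work" is plain integer arithmetic rather than frame execution.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an adaptive executor. All names are hypothetical. */
typedef struct toy_executor {
    /* Receives the address of the caller's slot, not the executor itself. */
    int (*execute)(struct toy_executor **slot, int arg);
    int calls;                  /* how many times this executor has run */
    struct toy_executor *hot;   /* replacement to install once warm (cold stub only) */
} toy_executor;

enum { TOY_HOT_THRESHOLD = 3 };  /* assumed threshold, chosen arbitrarily */

/* "Optimized" executor: does its work and leaves the slot alone. */
static int
toy_hot_execute(toy_executor **slot, int arg)
{
    toy_executor *self = *slot;
    assert(self != NULL);
    self->calls++;
    return arg * 2;
}

/* Cold stub: once it has run often enough, it overwrites the caller's
 * slot so that the next call goes straight to the hot executor. */
static int
toy_cold_execute(toy_executor **slot, int arg)
{
    toy_executor *self = *slot;
    assert(self != NULL && self->hot != NULL);
    if (++self->calls >= TOY_HOT_THRESHOLD) {
        *slot = self->hot;
        return self->hot->execute(slot, arg);
    }
    return arg + arg;  /* same result, standing in for the slow path */
}

/* The caller always dispatches through the slot and never caches the target. */
static int
toy_call(toy_executor **slot, int arg)
{
    return (*slot)->execute(slot, arg);
}
```

With a slot that initially points at a cold stub, the first two calls to `toy_call` leave the slot unchanged. The third call repoints it at the hot executor, and the caller needs no knowledge of the switch.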
On exiting from tier 2 execution, instead of returning to the tier 1 interpreter, we should jump to an adaptive stub which checks for hotness and dispatches accordingly.
For side exits with additional micro-ops to exit, the code for such a side-exit stub would look like:
_PyInterpreterFrame *
cold_execute(_PyAdaptiveExecutorObject **ptr, _PyInterpreterFrame *frame, PyObject **stack_pointer)
{
    _PyAdaptiveExecutorObject *self = *ptr;
    if (is_hot(self->counter)) {
        /* Hot: compile the exit's uops and patch the caller's pointer,
         * so the next call goes straight to the compiled executor. */
        _PyAdaptiveExecutorObject *executor = compile(self->uops);
        *ptr = executor;
        return executor->execute(ptr, frame, stack_pointer);
    }
    else {
        /* Still cold: interpret the exit's uops. */
        return uop_interpret(self->uops, frame, stack_pointer);
    }
}
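The stub above leaves `is_hot` and the maintenance of `self->counter` unspecified. One possible shape, purely as an assumption for illustration, is a saturating counter compared against a fixed threshold. `HOT_THRESHOLD`, `bump_counter`, and `exit_taken` are hypothetical names, and the 16-bit width and the threshold value are arbitrary choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOT_THRESHOLD 16  /* hypothetical value, not taken from the proposal */

/* Increment a counter, saturating at UINT16_MAX instead of wrapping to 0. */
static inline uint16_t
bump_counter(uint16_t counter)
{
    return counter == UINT16_MAX ? counter : (uint16_t)(counter + 1);
}

/* An exit counts as hot once it has been taken HOT_THRESHOLD times. */
static inline int
is_hot(uint16_t counter)
{
    return counter >= HOT_THRESHOLD;
}

/* Record one pass through an exit and report whether it is now hot. */
static int
exit_taken(uint16_t *counter)
{
    assert(counter != NULL);
    *counter = bump_counter(*counter);
    return is_hot(*counter);
}
```

Saturating rather than wrapping keeps a very hot exit from appearing cold again after overflow. Whether the counter is bumped inside the stub or by its caller is a design choice the text does not settle.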
For an end exit (or a side exit with no attached uops) the code looks like this:
_PyInterpreterFrame *
end_execute(_PyAdaptiveExecutorObject **ptr, _PyInterpreterFrame *frame, PyObject **stack_pointer)
{
    _PyAdaptiveExecutorObject *self = *ptr;
    if (!is_hot(self->counter)) {
        /* Still cold: return to the tier 1 interpreter. */
        return frame;
    }
    /* Declared before the branches so it is still in scope below. */
    _PyAdaptiveExecutorObject *executor = NULL;
    if (frame->next_instr->op.code == ENTER_EXECUTOR) {
        /* The target already has an executor: reuse it. */
        executor = frame->f_code->co_executors[frame->next_instr->op.arg];
    }
    else {
        /* No executor yet: ask the optimizer to create one. */
        _PyOptimizerObject *opt = _PyInterpreterState_Get()->optimizer;
        int err = opt->optimize(opt, frame->f_code, frame->next_instr, &executor, stack_pointer);
        if (err < 0) {
            return NULL;
        }
    }
    /* Patch the caller's pointer so the next exit jumps directly to the executor. */
    *ptr = executor;
    return executor->execute(ptr, frame, stack_pointer);
}