Allow reusing function in pymc.compute_log_likelihood #7073

Closed
@DanielRobertNicoud

Description

Describe the issue:

Calling pymc.compute_log_likelihood multiple times on the same model triggers a new compilation via compile_fn on every call. In some applications this is a significant time sink, and it could be avoided by storing and reusing the compiled function.
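To make the request concrete, here is a minimal sketch of the caching pattern being asked for. It deliberately does not use PyMC: `CompiledFunctionCache`, `slow_compile`, and the `"loglike"` key are hypothetical names for illustration, and `slow_compile` is only a stand-in for an expensive compilation step. It is not how pymc.compute_log_likelihood is implemented.

```python
import time


class CompiledFunctionCache:
    """Hypothetical helper: compile once per key, then reuse the result."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._cache = {}
        self.n_compilations = 0

    def get(self, key, *compile_args):
        # Compile only the first time a key is seen.
        if key not in self._cache:
            self._cache[key] = self._compile_fn(*compile_args)
            self.n_compilations += 1
        return self._cache[key]


def slow_compile(scale):
    # Stand-in for an expensive compilation step (assumption, not PyMC code).
    time.sleep(0.01)

    def fn(x):
        return scale * x

    return fn


cache = CompiledFunctionCache(slow_compile)
# Five evaluations, but the compile step runs only once.
results = [cache.get("loglike", 2.0)(x) for x in range(5)]
print(cache.n_compilations, results)
```

In a real setting the cache key would presumably have to change whenever the model graph changes, so that a stale compiled function is never reused.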

Reproducible code example:

import numpy as np
import pymc

n = 5_000
n_feat = 3
X_train = np.random.normal(size=(n, n_feat))
y_train = np.random.normal(size=(n))

with pymc.Model() as model:
    # data containers
    X = pymc.MutableData("X", X_train)
    y = pymc.MutableData("y", y_train)
    # priors
    intercept = pymc.Normal("intercept", mu=0, sigma=1)
    b = pymc.MvNormal("b", mu=np.zeros(n_feat), cov=np.eye(n_feat))
    sigma = pymc.HalfCauchy("sigma", beta=10)
    mu = intercept + pymc.math.dot(X, b).flatten()
    # likelihood
    likelihood = pymc.Normal("obs", mu=mu, sigma=sigma, observed=y)

    idata = pymc.sample()

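def compute_ll_twice(model, idata):
    # Despite the name, the log-likelihood is computed five times below.
    n_test = 5
    X_test = np.random.normal(size=(n_test, n_feat))
    y_test = np.random.normal(size=(n_test))
    with model:
        # swap in the test data once, then recompute the log-likelihood repeatedly
        pymc.set_data({"X": X_test, "y": y_test})
        for _ in range(5):
            out = pymc.compute_log_likelihood(idata, extend_inferencedata=False)

# IPython magic: profile the call and show the top 20 entries by cumulative time
%prun -l 20 -s cumtime compute_ll_twice(model, idata)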
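The `%prun` line above only works in IPython or Jupyter. Outside those environments, a roughly equivalent profile can be collected with the standard-library `cProfile` and `pstats` modules. The sketch below uses hypothetical names (`profile_call`, `workload`), and `workload` is a cheap stand-in for the function being profiled, not the PyMC call itself.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=20):
    """Profile func(*args) and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # "cumtime" matches the `-s cumtime` option; `limit` matches `-l 20`.
    stats.sort_stats("cumtime").print_stats(limit)
    return stream.getvalue()


def workload():
    # Stand-in for the function under test (assumption for illustration).
    return sum(i * i for i in range(10_000))


report = profile_call(workload)
print(report)
```

Passing the real function and its arguments to `profile_call` in place of `workload` should show how much of the cumulative time falls under the compilation calls.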

Error message:

No response

PyMC version information:

pymc==5.10.1

Context for the issue:

In a research application I am writing, I need to call pymc.compute_log_likelihood many times (sometimes refitting the model in between). The calls to compile_fn take up to 60% of my computation time. If this could be easily fixed, it would be extremely helpful for me. Thank you very much in advance!
