Closed
Description
Describe the issue:
Calling pymc.compute_log_likelihood multiple times on the same model triggers a new compilation via compile_fn on every call. This is a time sink (in some applications) that could easily be solved by storing the compiled function.
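To illustrate what "storing the compiled function" could look like, here is a minimal sketch of a compile-once cache. It does not use PyMC at all: `CompileOnceCache` and `fake_compile` are hypothetical names made up for this example, and `fake_compile` is only a stand-in for the expensive compilation step. It shows the idea, not how PyMC is or should be implemented.

```python
class CompileOnceCache:
    """Hypothetical helper: run an expensive compile step once per key."""

    def __init__(self, compile_step):
        self._compile_step = compile_step  # callable that builds the function
        self._compiled = {}                # key -> already compiled function
        self.n_compilations = 0            # counter, for illustration only

    def get(self, key):
        # Compile only on the first request for this key, then reuse.
        if key not in self._compiled:
            self._compiled[key] = self._compile_step(key)
            self.n_compilations += 1
        return self._compiled[key]

    def clear(self):
        """Drop cached functions, e.g. after the model itself is rebuilt."""
        self._compiled.clear()


def fake_compile(name):
    # Stand-in for the real compilation; returns a trivial function.
    return lambda x: f"{name}: {x}"


cache = CompileOnceCache(fake_compile)
results = [cache.get("obs")(i) for i in range(5)]
print(cache.n_compilations)  # prints 1: five calls, one compilation
```

A real version would need a sensible cache key and would have to be cleared whenever the model graph changes, for example after rebuilding the model.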
Reproducible code example:
import numpy as np
import pymc

n = 5_000
n_feat = 3
X_train = np.random.normal(size=(n, n_feat))
y_train = np.random.normal(size=(n))

with pymc.Model() as model:
    # data containers
    X = pymc.MutableData("X", X_train)
    y = pymc.MutableData("y", y_train)
    # priors
    intercept = pymc.Normal("intercept", mu=0, sigma=1)
    b = pymc.MvNormal("b", mu=np.zeros(n_feat), cov=np.eye(n_feat))
    sigma = pymc.HalfCauchy("sigma", beta=10)
    mu = intercept + pymc.math.dot(X, b).flatten()
    # likelihood
    likelihood = pymc.Normal("obs", mu=mu, sigma=sigma, observed=y)
    idata = pymc.sample()


def compute_ll_twice(model, idata):
    n_test = 5
    X_test = np.random.normal(size=(n_test, n_feat))
    y_test = np.random.normal(size=(n_test))
    with model:
        pymc.set_data({"X": X_test, "y": y_test})
        for _ in range(5):
            out = pymc.compute_log_likelihood(idata, extend_inferencedata=False)


%prun -l 20 -s cumtime compute_ll_twice(model, idata)
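The `%prun` line above only works in IPython or Jupyter. Outside a notebook, roughly the same report (top 20 entries sorted by cumulative time) can be produced with the standard library's `cProfile` and `pstats`. The helper `profile_call` and the stand-in workload `busy` are hypothetical names used only for this sketch; in practice one would pass `compute_ll_twice, model, idata` instead.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=20, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the first `top` entries,
    # mirroring `%prun -l 20 -s cumtime`.
    stats.sort_stats("cumtime").print_stats(top)
    return result, stream.getvalue()


def busy(n):
    # Stand-in workload so the sketch runs without PyMC installed.
    return sum(i * i for i in range(n))


result, report = profile_call(busy, 100)
print(report)
```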
Error message:
No response
PyMC version information:
pymc==5.10.1
Context for the issue:
In a research application I am writing, I need to call pymc.compute_log_likelihood many times (sometimes refitting the model in between). The calls to compile_fn take up to 60% of my computation time. If this could be easily fixed, it would be extremely helpful for me. Thank you very much in advance!