Skip to content

Add sample_prior function #7833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AllenDowney
Copy link

@AllenDowney AllenDowney commented Jun 23, 2025

Back in 2018, I participated in a discussion of an enhancement of PyMC, adding functions to sample the prior, prior predictive, and posterior predictive distributions.

Here’s the PR: #2983

I suggested an API with four functions:

  1. sample_prior
  2. sample_prior_predictive
  3. sample_posterior (synonym for sample)
  4. sample_posterior_predictive (synonym for sample_ppc)

In the end, only sample_prior_predictive and sample_posterior_predictive were added.

Reflecting on that decision, I think there are cases where we want to sample from the prior only and not from the prior predictive. For example, in a model that contains a Potential, the sample from the prior predictive would not reflect the Potential and might be misleading (there’s a warning about that). Also, I have encountered a case where sampling from the prior worked correctly, but sampling from the prior predictive generates an error (sometimes).

As an alternative, you could call sample_prior_predictive and specify var_names, but sample_prior as a wrapper for sample_prior_predictive automates the process of identifying unobserved vars and deterministics (that don't depend on observed vars).

So I am considering adding sample_prior to PyMC. What do we think?


📚 Documentation preview 📚: https://pymc--7833.org.readthedocs.build/en/7833/

Copy link

welcome bot commented Jun 23, 2025

Thank You Banner]
💖 Thanks for opening this pull request! 💖 The PyMC community really appreciates your time and effort to contribute to the project. Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

@ricardoV94
Copy link
Member

ricardoV94 commented Jun 25, 2025

A clarification on the potential question.

It is not about prior vs prior_predictive. It's about whether you do forward or inverse (aka mcmc) sampling. Potentials only act on the logp so they can't be accounted for by forward sampling.

A Potential can be part of the prior, likelihood or both. The current API doesn't allow disambiguating the meaning and I don't think graph analysis can do it either.

Potentials are a case we warn loudly about but they aren't the only that fails. Other cases include using Flat priors or non-default transforms like Ordered. All these should fail or emit a warning when doing forward sampling, be it through prior, prior_predictive or posterior_predictive.

What could be useful would be to allow using mcmc for prior/prior_predictive for the cases where the forward graph is not compatible. I don't think we can do it for posterior_predictive because we would need one / several good draws for every posterior draw.

I don't love the mangling between "kind of distribution" and method of sampling. A better mental model is that sample uses mcmc sampling (what you obtain is up to you, it could be a prior, a posterior or a posterior_predictive... a prior is someone else's posterior). sample_prior_predictive and sample_posterior_predictive use forward sampling and what you obtain with this is again up to you.

All this is pretty orthogonal to the PR but it was a good excuse for my rant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants