Pinning parameters #56

Open
wants to merge 4 commits into
base: master
153 changes: 153 additions & 0 deletions designs/0035-pin-parameters.md
@@ -0,0 +1,153 @@
- Feature Name: pin-parameters
- Start Date: 2025-05-05
- RFC PR:
- Stan Issue:

# Summary
[summary]: #summary

It is often useful to 'pin' the otherwise free parameters of a statistical model to specific values. First, pinning helps when debugging a model, whether to diagnose computational problems or to understand the priors. Second, it helps when one wishes to explore a simpler model that is nested inside an extended one, e.g., the no-effect model mu = 0 that is nested inside the model mu != 0 of a new effect of size mu. This proposal makes pinning straightforward at runtime.

# Motivation
[motivation]: #motivation

At present, to pin a parameter a Stan model must be rewritten. We must either:

- move the parameter from the parameters block to the data block, where it is pinned to a fixed value, or
- add convoluted logic so that a boolean (more precisely, an `int<lower=0, upper=1>`) in the data block controls whether a parameter is pinned, e.g., (from [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/4?u=andrewfowlie))
```stan
data {
int<lower=0, upper=1> mu_is_data;
array[mu_is_data] real mu_data;
...
parameters {
array[1 - mu_is_data] real mu_param;
...
transformed parameters {
real mu = mu_is_data ? mu_data[1] : mu_param[1];
...
```

This pattern is unsatisfactory: it is clunky and obscures the generative structure of the model.
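To make the cost of this workaround concrete, the following Python sketch builds the data entries that the conditional model above expects. The helper name `mu_data_entries` is hypothetical; only `mu_is_data` and `mu_data` come from the Stan snippet.

```python
import json


def mu_data_entries(pinned_value=None):
    """Return the data entries for the conditional-pinning workaround.

    `mu_data` must be an array of length `mu_is_data`, so it holds one
    value when mu is pinned and is empty when mu is a free parameter.
    """
    if pinned_value is None:
        return {"mu_is_data": 0, "mu_data": []}
    return {"mu_is_data": 1, "mu_data": [float(pinned_value)]}


# mu pinned to 0 (the no-effect model) versus mu left free
print(json.dumps(mu_data_entries(0.0)))
print(json.dumps(mu_data_entries()))
```

Every parameter that might be pinned needs its own flag, data array and parameter array, so the bookkeeping grows with the model.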

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

The following section is a draft of the docs for the new `pin` keyword-value pair in the `cmdstan` command-line options that would appear [here](https://mc-stan.org/docs/cmdstan-guide/command_line_options.html). We do not, at this stage, propose how this feature would be propagated to other Stan interfaces, though we do not anticipate any difficulties.

## Command-Line Interface Overview

A `pin` argument is added to the command-line interface. This argument appears parallel to `init` in the configuration hierarchy and is available for all inference methods.

...
- `init` - ...
- `pin` - specifies values for any parameters that should be pinned, if any

Member

CmdStan is very particular about the hierarchical structure of arguments, so this needs to be clarified in terms of where it is possible. Which interfaces will it apply to?

  • HMC, NUTS sampling
  • ADVI, Pathfinder VI
  • Laplace approximation
  • plain old optimization

Each of these has its own particular argument structure. For example, here's sampling's default config hierarchy into which `pin` will need to be slotted:

method = sample (Default)
  sample
    num_samples = 1000 (Default)
    num_warmup = 1000 (Default)
    save_warmup = false (Default)
    thin = 1 (Default)
    adapt
      engaged = true (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
      save_metric = false (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file =  (Default)
        stepsize = 1 (Default)
        stepsize_jitter = 0 (Default)
    num_chains = 1 (Default)
id = 0 (Default)
data
  file = bernoulli.data.json
init = 2 (Default)
random
  seed = 3252652196 (Default)
output
  file = output.csv (Default)
  diagnostic_file =  (Default)
  refresh = 100 (Default)

I would suggest making it parallel to `init`. This placement should be available for all of the inference options.

- `random` - ...
...

### Pin model parameters argument

Member

This section needs to make it clear that variables are going to be pinned or not pinned---they can't be partially pinned. Suppose I have the following program.

vector[10] a;

I can pin all 10 components of a or none of them. I can't just pin a[3] and a[7] and leave the others as parameters, for example.

Member

It's also worth noting that this restriction is primarily due to the lack of a good design, NOT due to mathematical concerns like the restriction on constrained parameters.

If we could agree on what the specification for the JSON file that only specifies a[3] and a[7] looks like, we could plumb enough information through the compiler and inference to make it work.

Member

> this restriction is primarily due to the lack of a good design, NOT due to mathematical concerns like the restriction on constrained parameters.

It's fine mathematically to pin a constrained parameter. It's just too challenging on the Stan side to evaluate the implied constraints on remaining parameters and whether the result is even consistent or expressible in Stan.

In the same way, pinning just two values is more of a problem in terms of rewriting the Stan model than in terms of math. If we pin just two values, it's like this:

data {
  vector[2] pinned_a;
  ...
parameters {
  vector[8] unpin_a;
  ...
transformed parameters {
  vector[10] a;
  a[1:2] = unpin_a[1:2];
  a[3] = pinned_a[1];
  a[4:6] = unpin_a[3:5];
  a[7] = pinned_a[2];
  a[8:10] = unpin_a[6:8];
  ...

I don't know an equivalently simple workaround to the one @andrewfowlie listed in the original proposal for fully pinned parameters.


Parameters defined in the parameters block can be 'pinned' to specific values. This is useful when debugging a model or exploring a simpler model that is nested inside an extended one.

By default, no parameters are pinned. Pinned values are read from an input file in JSON format, specified using the syntax:
```
pin=<filepath>
```
The value must be a filepath to a JSON file containing pinned values for some or all of the parameters in the parameters block. This file should be in the same JSON format as that used for other Stan input files (e.g., `init`); see [here](https://mc-stan.org/docs/cmdstan-guide/json_apdx.html#creating-json-files) for more information about JSON and creating JSON files.
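As an illustration, a pin file could be written with Python's standard `json` module. This is a minimal sketch under stated assumptions: the parameter name `mu`, the file names, the executable name `./model` and the helper `write_pin_file` are all hypothetical, and `pin=` is the option proposed here, not an existing CmdStan argument.

```python
import json
import os
import tempfile


def write_pin_file(pinned, path):
    """Write pinned parameter values to a JSON file, as one would for `init`."""
    with open(path, "w") as f:
        json.dump(pinned, f)
    return path


# Hypothetical example: pin the scalar parameter `mu` to 0.
pin_path = write_pin_file({"mu": 0.0}, os.path.join(tempfile.gettempdir(), "pin.json"))

# Proposed invocation (not run here); `pin=` sits at the same level as `init=`.
command = ["./model", "sample", "data", "file=data.json", f"pin={pin_path}"]
print(" ".join(command))
```

Because the pin file is separate from the data file, pinning can be turned off by dropping the `pin=` argument, with no change to the model or its data.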

At present, there are two restrictions on parameters that can be pinned.

1. You cannot pin a subset of the elements of a non-scalar parameter (e.g., `vector`, `array`, `matrix` or `tuple`); either all elements are pinned or none are. E.g., consider
```stan
parameters {
  vector[5] x;
}
```
We can pin all 5 elements of `x` or no elements. We cannot pin, e.g., only the first element `x[1]`.
2. You cannot pin constrained parameters. E.g., consider
```stan
parameters {
  real<lower=0> x;
}
```
Because it is constrained, we cannot pin `x`.
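The two restrictions could be checked before a run with something like the following Python sketch. It assumes a hypothetical description of the parameters block (a `shape` and a `constrained` flag per parameter) and hypothetical helper names; nothing here reflects how Stan would actually implement the check. The constrained parameter is named `sigma` only to avoid clashing with the vector `x`.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list; () for a scalar."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def check_pins(params, pins):
    """Return a list of error messages; an empty list means the pins are valid.

    params: {name: {"shape": tuple, "constrained": bool}}  (hypothetical format)
    pins:   {name: value}, as read from the pin JSON file
    """
    errors = []
    for name, value in pins.items():
        if name not in params:
            errors.append(f"{name}: not in the parameters block")
        elif params[name]["constrained"]:
            errors.append(f"{name}: constrained parameters cannot be pinned")
        elif shape_of(value) != params[name]["shape"]:
            errors.append(f"{name}: all elements must be pinned, or none")
    return errors


params = {
    "x": {"shape": (5,), "constrained": False},   # vector[5] x;
    "sigma": {"shape": (), "constrained": True},  # real<lower=0> sigma;
}
print(check_pins(params, {"x": [0.0, 0.0, 0.0, 0.0, 0.0]}))  # all of x pinned
print(check_pins(params, {"x": [0.0], "sigma": 1.0}))        # violates both rules
```

A parameter that is absent from the pin file is simply left free, which matches the "all or none" rule for non-scalar parameters.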

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

** to be discussed **

# Drawbacks
[drawbacks]: #drawbacks

1. It is another command-line argument and there are already several.
2. The same thing can be achieved by coding the model in a more complicated way, as shown above.
3. It changes the *interpretation* of a Stan model, though in a very explicit way.
4. The two restrictions, particularly that one cannot pin constrained parameters, will limit use cases.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives


Pinning parameters at runtime is far more elegant than existing solutions. An alternative would be a new keyword in the Stan language itself, e.g., in a parameter constraint
```stan
parameters {
  real<pin=0> mu;
}
```
or using a new annotation
```stan
parameters {
  @pin(0)
  real mu;
}
```
They could possibly be combined with the data block, e.g.,
```stan
data {
  real mu_data;
}
parameters {
  real<pin=mu_data> mu;
}
```
These possibilities are certainly neater than
```stan
data {
int<lower=0, upper=1> mu_is_data;
array[mu_is_data] real mu_data;
...
parameters {
array[1 - mu_is_data] real mu_param;
...
transformed parameters {
real mu = mu_is_data ? mu_data[1] : mu_param[1];
...
```
Even with `<pin=>` or `@pin`, pinning still requires one to change a model and recompile, and doesn't let us turn off pinning at runtime.

# Prior art
[prior-art]: #prior-art

`PyMC` has specific functionality for pinning; see [here](https://www.pymc.io/projects/docs/en/stable/api/model/generated/pymc.model.transform.conditioning.do.html). In `PyMC`, pinning (and perhaps other similar operations) is called an 'intervention'. The example given in the docs is:
```python
import pymc as pm
import arviz as az  # added to make example code work

with pm.Model() as m:
    x = pm.Normal("x", 0, 1)
    y = pm.Normal("y", x, 1)
    z = pm.Normal("z", y + x, 1)

# Dummy posterior, same as calling `pm.sample`
idata_m = az.from_dict({rv.name: [pm.draw(rv, draws=500)] for rv in [x, y, z]})

# Replace `y` by a constant `100.0`
with pm.do(m, {y: 100.0}) as m_do:
    idata_do = pm.sample_posterior_predictive(idata_m, var_names="z")
```

Member

This is really great that you included a PyMC example.

I don't know if we want to mention this, but @WardBrian found a bug in their implementation where they don't respect constraints. Probably not relevant for this discussion.

Do you know how general this is in PyMC? If not, we can ask their devs. For example, does PyMC let me set just one component of a vector?

Author

I don't know PyMC that well and I've never used this feature. But just playing around with it, it seems like it's all or nothing.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

I don't know how this would be implemented technically, but there is a relevant comment [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/7?u=andrewfowlie).