From 61fdacec76bd2500f695566b4a67872f6cace7d6 Mon Sep 17 00:00:00 2001 From: Andrew Fowlie Date: Mon, 5 May 2025 16:42:48 +0800 Subject: [PATCH 1/4] add first draft of proposal --- designs/0035-pin-parameters.md | 122 +++++++++++++++++++++++++++++++++ 1 file changed, 122 insertions(+) create mode 100644 designs/0035-pin-parameters.md diff --git a/designs/0035-pin-parameters.md b/designs/0035-pin-parameters.md new file mode 100644 index 0000000..d166649 --- /dev/null +++ b/designs/0035-pin-parameters.md @@ -0,0 +1,122 @@ +- Feature Name: pin-parameters +- Start Date: 2025-05-05 +- RFC PR: +- Stan Issue: + +# Summary +[summary]: #summary + +It is often useful to be able to 'pin' the otherwise free parameters in a statistical model to specific values. First, this is useful when debugging a statistical model to diagnose computational problems or understand the priors. Second, it is useful when one wishes to explore a simpler model that is nested inside an extended one, e.g., the mu = 0 no-effect model that is nested inside the mu != 0 model of a new effect of size mu. This proposal makes pinning straightforward at runtime. + +# Motivation +[motivation]: #motivation + +At present, to pin a parameter a Stan model must be rewritten. We must either: + +- move a parameter from the parameter block to the data block, where it is pinned to a fixed value +- add convoluted logic so that a boolean (more precisely an integer) in the data block can control whether a parameter is pinned, e.g., (from [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/4?u=andrewfowlie)) +``` +data { + int mu_is_data; + array[mu_is_data] mu_val; + ... +parameters { + array[1 - mu_is_data] mu_param; + ... +transformed parameters { + real mu = mu_is_data ? mu_val[1] : mu_param[1]; + ... +``` + +These are inelegant and unsatisfactory, partly as they require rewriting and recompiling a model. 
+ +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +The following section is a draft of the docs for the new `pin` keyword-value pair in `cmdstan` command-line options that would appear [here](https://mc-stan.org/docs/cmdstan-guide/command_line_options.html). We don't at this stage make any proposal for how this feature would be propagated to other Stan interfaces, but we do not anticipate any difficulties. + +## Command-Line Interface Overview + +... + +- `pin` - specifies values for any parameters that should be pinned, if any + +... + +### Pin model parameters argument + +Parameters defined in the parameters block can be 'pinned' to specific values. This is useful when debugging a model or exploring a simpler model that is nested inside an extended one. + +By default, no parameters are pinned. The pinned parameters are read from an input data file in JSON format using the syntax: +``` +pin= +``` +The value must be a filepath to a JSON file containing pinned values for some or all of the parameters in the parameters block. + +At present, there are two restrictions on parameters that can be pinned: + +- you cannot pin a subset of elements of a vector; all elements must be pinned +- you cannot pin constrained parameters + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +** to be discussed ** + +# Drawbacks +[drawbacks]: #drawbacks + +- It's another command-line argument and there are already several +- The same thing can be achieved by re-programming the model. +- It changes the *interpretation* of a Stan model, though in a very explicit way +- The two restrictions, particularly cannot pin constrained parameters, might limit use-cases + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + + +I think pinning parameters at runtime is far more elegant than existing solutions. 
At first, I had thought about a new keyword in the Stan language itself, e.g., in a parameter constraint +``` +parameters { + real mu; +} +``` +It's certainly neater than +``` +data { + int mu_is_data; + array[mu_is_data] mu_val; + ... +parameters { + array[1 - mu_is_data] mu_param; + ... +transformed parameters { + real mu = mu_is_data ? mu_val[1] : mu_param[1]; + ... +``` +but even with ``, pinning still requires one to change a model and recompile. + +# Prior art +[prior-art]: #prior-art + +`PyMC` has specific functionality for pinning. See [here](https://www.pymc.io/projects/docs/en/stable/api/model/generated/pymc.model.transform.conditioning.do.html). In `PyMC`, pinning (and perhaps other similar things) are called 'interventions'. The example given in the docs is this, +``` +import pymc as pm + +with pm.Model() as m: + x = pm.Normal("x", 0, 1) + y = pm.Normal("y", x, 1) + z = pm.Normal("z", y + x, 1) + +# Dummy posterior, same as calling `pm.sample` +idata_m = az.from_dict({rv.name: [pm.draw(rv, draws=500)] for rv in [x, y, z]}) + +# Replace `y` by a constant `100.0` +with pm.do(m, {y: 100.0}) as m_do: + idata_do = pm.sample_posterior_predictive(idata_m, var_names="z") +``` + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +I don't know how this would be implemented technically, but there is a comment [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/7?u=andrewfowlie) From 913b746b1111d19600088c65d57dbe35b11bb9f3 Mon Sep 17 00:00:00 2001 From: Andrew Fowlie Date: Tue, 6 May 2025 12:28:16 +0800 Subject: [PATCH 2/4] fix code snippets --- designs/0035-pin-parameters.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/designs/0035-pin-parameters.md b/designs/0035-pin-parameters.md index d166649..b18b135 100644 --- a/designs/0035-pin-parameters.md +++ b/designs/0035-pin-parameters.md @@ -15,16 +15,16 @@ At present, to pin a parameter a Stan model must be rewritten. 
We must either: - move a parameter from the parameter block to the data block, where it is pinned to a fixed value - add convoluted logic so that a boolean (more precisely an integer) in the data block can control whether a parameter is pinned, e.g., (from [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/4?u=andrewfowlie)) -``` +```stan data { int mu_is_data; - array[mu_is_data] mu_val; + array[mu_is_data] real mu_data; ... parameters { - array[1 - mu_is_data] mu_param; + array[1 - mu_is_data] real mu_param; ... transformed parameters { - real mu = mu_is_data ? mu_val[1] : mu_param[1]; + real mu = mu_is_data ? mu_data[1] : mu_param[1]; ... ``` @@ -76,22 +76,22 @@ At present, there are two restrictions on parameters that can be pinned: I think pinning parameters at runtime is far more elegant than existing solutions. At first, I had thought about a new keyword in the Stan language itself, e.g., in a parameter constraint -``` +```stan parameters { real mu; } ``` It's certainly neater than -``` +```stan data { int mu_is_data; - array[mu_is_data] mu_val; + array[mu_is_data] real mu_data; ... parameters { - array[1 - mu_is_data] mu_param; + array[1 - mu_is_data] real mu_param; ... transformed parameters { - real mu = mu_is_data ? mu_val[1] : mu_param[1]; + real mu = mu_is_data ? mu_data[1] : mu_param[1]; ... ``` but even with ``, pinning still requires one to change a model and recompile. @@ -100,7 +100,7 @@ but even with ``, pinning still requires one to change a model and recompi [prior-art]: #prior-art `PyMC` has specific functionality for pinning. See [here](https://www.pymc.io/projects/docs/en/stable/api/model/generated/pymc.model.transform.conditioning.do.html). In `PyMC`, pinning (and perhaps other similar things) are called 'interventions'. 
The example given in the docs is this, -``` +```python import pymc as pm with pm.Model() as m: From 910b687c10c22e35ee40ede5661bd908372d4163 Mon Sep 17 00:00:00 2001 From: Andrew Fowlie Date: Tue, 6 May 2025 12:52:08 +0800 Subject: [PATCH 3/4] tweak text to address feedback --- designs/0035-pin-parameters.md | 62 +++++++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 16 deletions(-) diff --git a/designs/0035-pin-parameters.md b/designs/0035-pin-parameters.md index b18b135..ec3c0a4 100644 --- a/designs/0035-pin-parameters.md +++ b/designs/0035-pin-parameters.md @@ -14,7 +14,7 @@ It is often useful to be able to 'pin' the otherwise free parameters in a statis At present, to pin a parameter a Stan model must be rewritten. We must either: - move a parameter from the parameter block to the data block, where it is pinned to a fixed value -- add convoluted logic so that a boolean (more precisely an integer) in the data block can control whether a parameter is pinned, e.g., (from [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/4?u=andrewfowlie)) +- add convoluted logic so that a boolean (more precisely an `integer`) in the data block can control whether a parameter is pinned, e.g., (from [here](https://discourse.mc-stan.org/t/fixing-parameters-in-a-model/39035/4?u=andrewfowlie)) ```stan data { int mu_is_data; @@ -28,7 +28,7 @@ transformed parameters { ... ``` -These are inelegant and unsatisfactory, partly as they require rewriting and recompiling a model. +This pattern is inelegant and unsatisfactory, as it is clunky and obfuscates the inherent generative structure of the model. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -37,10 +37,12 @@ The following section is a draft of the docs for the new `pin` keyword-value pai ## Command-Line Interface Overview -... +A `pin` argument is added to the command-line interface. 
This argument appears parallel to `init` in the configuration hierarchy and applies to all Stan interfaces. +... +- `init` - ... - `pin` - specifies values for any parameters that should be pinned, if any - +- `random` - ... ... ### Pin model parameters argument @@ -51,12 +53,24 @@ By default, no parameters are pinned. The pinned parameters are read from an inp ``` pin= ``` -The value must be a filepath to a JSON file containing pinned values for some or all of the parameters in the parameters block. +The value must be a filepath to a JSON file containing pinned values for some or all of the parameters in the parameters block. This file should be in the same JSON format as that used for other Stan files (e.g., `init`); see [here](https://mc-stan.org/docs/cmdstan-guide/json_apdx.html#creating-json-files) for more information about JSON and creating JSON files. -At present, there are two restrictions on parameters that can be pinned: +At present, there are two restrictions on parameters that can be pinned. -- you cannot pin a subset of elements of a vector; all elements must be pinned -- you cannot pin constrained parameters +1. You cannot pin a subset of elements of a non-scalar parameter (e.g., `vector`, `array`, `matrix` or `tuple`); either all elements must be pinned or none. E.g., consider +```stan +parameters { + vector[5] x; +} +``` +We can pin all 5 elements of `x` or no elements. We cannot pin, e.g., only the first element `x[1]`. +2. You cannot pin constrained parameters. E.g., consider +```stan +parameters { + real x; +} +``` +Because it is constrained, we cannot pin `x`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -66,22 +80,38 @@ At present, there are two restrictions on parameters that can be pinned: # Drawbacks [drawbacks]: #drawbacks -- It's another command-line argument and there are already several -- The same thing can be achieved by re-programming the model. 
-- It changes the *interpretation* of a Stan model, though in a very explicit way -- The two restrictions, particularly cannot pin constrained parameters, might limit use-cases +1. It is another command-line argument and there are already several. +2. The same thing can be achieved by coding the model in a more complicated way, as shown above. +3. It changes the *interpretation* of a Stan model, though in a very explicit way. +4. The two restrictions, particularly that one cannot pin constrained parameters, will limit use cases. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives -I think pinning parameters at runtime is far more elegant than existing solutions. At first, I had thought about a new keyword in the Stan language itself, e.g., in a parameter constraint +Pinning parameters at runtime is far more elegant than existing solutions. An alternative would be a new keyword in the Stan language itself, e.g., in a parameter constraint ```stan parameters { - real mu; + real mu; } ``` +or using a new annotation +```stan +parameters { + @pin(0) + real mu; +} +``` +They could possibly be combined with the data block, e.g., +```stan +data { + real mu_data; +} +parameters { + real mu; } ``` -It's certainly neater than +These possibilities are certainly neater than ```stan data { int mu_is_data; @@ -94,7 +124,7 @@ transformed parameters { real mu = mu_is_data ? mu_data[1] : mu_param[1]; ... ``` -but even with ``, pinning still requires one to change a model and recompile. +Even with `` or `@pin`, pinning still requires one to change a model and recompile, and doesn't let us turn off pinning at runtime. 
# Prior art [prior-art]: #prior-art From ea7d3f0601f8fdba8963e4f627bf49e2e8de89e0 Mon Sep 17 00:00:00 2001 From: Andrew Fowlie Date: Tue, 6 May 2025 13:08:35 +0800 Subject: [PATCH 4/4] tweak pymc snippet so that it works --- designs/0035-pin-parameters.md | 1 + 1 file changed, 1 insertion(+) diff --git a/designs/0035-pin-parameters.md b/designs/0035-pin-parameters.md index ec3c0a4..2b07b6b 100644 --- a/designs/0035-pin-parameters.md +++ b/designs/0035-pin-parameters.md @@ -132,6 +132,7 @@ Even with `` or `@pin`, pinning still requires one to change a model and r `PyMC` has specific functionality for pinning. See [here](https://www.pymc.io/projects/docs/en/stable/api/model/generated/pymc.model.transform.conditioning.do.html). In `PyMC`, pinning (and perhaps other similar things) are called 'interventions'. The example given in the docs is this, ```python import pymc as pm +import arviz as az # added to make example code work with pm.Model() as m: x = pm.Normal("x", 0, 1)
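
The proposed pin file and its whole-parameter restriction can be sketched in Python as follows. This is only an illustration under stated assumptions: that the pin file maps parameter names to values in the same JSON layout as an `init` file, and that a parameter's size is known up front. The names `check_pins` and `declared_sizes` are hypothetical and are not part of CmdStan or of the proposal.

```python
import json


def check_pins(declared_sizes, pins):
    """Return error messages for pins that break the whole-parameter rule.

    declared_sizes: hypothetical map from parameter name to its number of
        elements (1 for a scalar).
    pins: map from parameter name to the pinned value or list of values.
    """
    errors = []
    for name, value in pins.items():
        if name not in declared_sizes:
            errors.append(f"{name}: not a declared parameter")
            continue
        n_values = len(value) if isinstance(value, list) else 1
        if n_values != declared_sizes[name]:
            errors.append(
                f"{name}: got {n_values} value(s), expected {declared_sizes[name]}"
            )
    return errors


# Hypothetical model with a scalar `mu` and a `vector[5] x`.
declared_sizes = {"mu": 1, "x": 5}

good_pins = {"mu": 0.0}   # pin the scalar and leave x free
bad_pins = {"x": [1.0]}   # pins only part of x, which the proposal forbids

# Text that would be written to the file passed to the proposed `pin` argument.
pin_json = json.dumps(good_pins)
```

Pinning `mu` alone passes the check, while supplying a single value for the five-element `x` is reported as an error, mirroring the first restriction in the guide-level text.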