Optimization
Introduction
Optimization provides an alternative approach to marginal inference.
In this section we refer to the program for which we would like to obtain the marginal distribution as the target program.
If we take a target program and add a guide distribution to each random choice, then we can define the guide program as the program you get when you sample from the guide distribution at each sample statement and ignore all factor statements.
If we endow this guide program with adjustable parameters, then we can optimize those parameters so as to minimize the distance between the joint distribution of the choices in the guide program and those in the target.
This general approach includes a number of well-known algorithms as special cases.
It is supported in WebPPL by a method for performing optimization, primitives for specifying parameters, and the ability to specify guides.
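To make the idea concrete, here is a minimal plain JavaScript sketch (not WebPPL, and not how WebPPL implements optimization) that scores a guide against a toy target using a Monte Carlo estimate of the ELBO, the objective described under Estimators. Everything in it is an assumption made for illustration: the target has a single random choice x drawn from Gaussian(0, 1) followed by one factor that scores the observation 2 under Gaussian(x, 1), the guide is a Gaussian with adjustable parameters guideMu and guideSigma, and all function names are hypothetical.

```javascript
// Log density of a Gaussian with mean mu and standard deviation sigma at x.
function gaussianLogPdf(x, mu, sigma) {
  var z = (x - mu) / sigma;
  return -0.5 * z * z - Math.log(sigma) - 0.5 * Math.log(2 * Math.PI);
}

// Standard normal draw via the Box-Muller transform.
function sampleStdNormal() {
  var u = 1 - Math.random(); // lies in (0, 1], so Math.log(u) is finite
  var v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Toy target (assumed): x ~ Gaussian(0, 1), then a factor scoring the
// observation 2 under Gaussian(x, 1).
function targetLogJoint(x) {
  return gaussianLogPdf(x, 0, 1) + gaussianLogPdf(2, x, 1);
}

// Monte Carlo ELBO estimate: sample x from the guide and average
// (target log joint) - (guide log density).
function estimateElbo(guideMu, guideSigma, numSamples) {
  var total = 0;
  for (var i = 0; i < numSamples; i++) {
    var x = guideMu + guideSigma * sampleStdNormal();
    total += targetLogJoint(x) - gaussianLogPdf(x, guideMu, guideSigma);
  }
  return total / numSamples;
}
```

For this toy target the exact posterior is a Gaussian with mean 1 and standard deviation Math.sqrt(0.5). When the guide parameters match it, every sample yields the same value (the log evidence); any other setting gives a lower value in expectation, which is what makes the guide parameters worth optimizing.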
Optimize

Optimize(options)
Arguments: options (object) – Optimization options.
Returns: Nothing.
Optimizes the parameters of the guide program specified by the model option.
The following options are supported:

model
A function of zero arguments that specifies the target and guide programs.
This option must be present.

steps
The number of optimization steps to take.
Default: 1

optMethod
The optimization method used. The following methods are available:
'sgd'
'adagrad'
'rmsprop'
'adam'
Each method takes a stepSize suboption; see below for example usage. Additional method-specific options are available; see the adnn optimization module for details.
Default: 'adam'

estimator
Specifies the optimization objective and the method used to estimate its gradients. See Estimators.
Default: ELBO

verbose
Default: true
Example usage:
Optimize({model: model, steps: 100});
Optimize({model: model, optMethod: 'adagrad'});
Optimize({model: model, optMethod: {sgd: {stepSize: 0.5}}});
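As a rough guide to what steps and stepSize control, the loop below is a plain JavaScript sketch of basic gradient descent on a hypothetical one-parameter objective. It is not the adnn implementation: the names sgd and grad and the quadratic objective are made up for illustration, and the gradient here is exact, whereas Optimize works with stochastic gradient estimates.

```javascript
// Plain gradient descent: repeat `steps` times, moving against the gradient.
function sgd(gradFn, initial, options) {
  var steps = options.steps;       // number of update steps to take
  var stepSize = options.stepSize; // scales the size of each update
  var theta = initial;
  for (var i = 0; i < steps; i++) {
    theta = theta - stepSize * gradFn(theta);
  }
  return theta;
}

// Hypothetical objective (theta - 3)^2, whose gradient is 2 * (theta - 3).
var grad = function(theta) { return 2 * (theta - 3); };

// A single step barely moves the parameter; many steps approach the minimum at 3.
var afterOne = sgd(grad, 0, {steps: 1, stepSize: 0.1});
var afterMany = sgd(grad, 0, {steps: 100, stepSize: 0.1});
```

With the default of a single step the parameter moves only slightly from its initial value, so in practice steps usually needs to be set explicitly.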
Estimators
The following estimators are available:

ELBO
This is the evidence lower bound (ELBO). Optimizing this objective yields variational inference.
For best performance use mapData() in place of map() where possible when optimizing this objective. The conditional independence information this provides is used to reduce the variance of gradient estimates, which can significantly improve performance, particularly in the presence of discrete random choices. Data subsampling is also supported through the use of mapData().
The following options are supported:

samples
The number of samples to take for each gradient estimate.
Default: 1

avgBaselines
Enable the “average baseline removal” variance reduction strategy.
Default: true

avgBaselineDecay
The decay rate used in the exponential moving average used to estimate baselines.
Default: 0.9

Example usage:
Optimize({model: model, estimator: 'ELBO'});
Optimize({model: model, estimator: {ELBO: {samples: 10}}});
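The avgBaselineDecay option refers to an exponential moving average. The plain JavaScript sketch below illustrates that kind of average only; the exact update WebPPL performs, including how the first value is handled, is not stated here and is assumed. The function names and the input values are hypothetical. With a decay rate of 0.9, each update keeps 90% of the previous estimate and mixes in 10% of the new value.

```javascript
// One exponential-moving-average update with the given decay rate.
function updateBaseline(baseline, value, decay) {
  return decay * baseline + (1 - decay) * value;
}

// Feed a stream of hypothetical per-step values through the average,
// starting from the first value (an assumed initialization).
function runningBaseline(values, decay) {
  return values.slice(1).reduce(function(baseline, value) {
    return updateBaseline(baseline, value, decay);
  }, values[0]);
}

// 10 -> 11 -> 11.9 -> 12.71: the estimate drifts slowly toward recent values.
var baseline = runningBaseline([10, 20, 20, 20], 0.9);
```

A decay rate closer to 1 gives a smoother but slower-moving baseline; a smaller rate tracks recent values more closely.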
Parameters

param([options])
Retrieves the value of a parameter by name. If the parameter does not exist, it is created and initialized with a draw from a Gaussian distribution.
The following options are supported:

dims
When dims is given, param returns a tensor of dimension dims. In this case dims should be an array.
When dims is omitted, param returns a scalar.

mu
The mean of the Gaussian distribution from which the initial parameter value is drawn.
Default: 0

sigma
The standard deviation of the Gaussian distribution from which the initial parameter value is drawn. Specify a standard deviation of 0 to deterministically initialize the parameter to mu.
Default: 0.1

name
The name of the parameter to retrieve. If name is omitted, a default name is automatically generated based on the current stack address, relative to the current coroutine.
Examples:
param()
param({name: 'myparam'})
param({mu: 0, sigma: 0.01, name: 'myparam'})
param({dims: [10, 10]})
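The retrieve-or-create behaviour described above can be pictured with the following plain JavaScript sketch. It is a hypothetical stand-in, not WebPPL's implementation: getParam and store are invented names, a name is always required (automatic naming from the stack address is not modelled), and a tensor is represented as a plain object holding dims and a flat data array rather than a real tensor type.

```javascript
// Hypothetical in-memory parameter store, keyed by parameter name.
var store = Object.create(null);

// Standard normal draw via the Box-Muller transform.
function sampleStdNormal() {
  var u = 1 - Math.random(); // lies in (0, 1], so Math.log(u) is finite
  var v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Return the named parameter, creating it with Gaussian(mu, sigma) draws
// on first use. Defaults mirror the documented ones: mu 0, sigma 0.1.
function getParam(options) {
  var mu = options.mu === undefined ? 0 : options.mu;
  var sigma = options.sigma === undefined ? 0.1 : options.sigma;
  if (!(options.name in store)) {
    var draw = function() { return mu + sigma * sampleStdNormal(); };
    if (options.dims === undefined) {
      store[options.name] = draw(); // scalar parameter
    } else {
      // One independent draw per tensor element.
      var size = options.dims.reduce(function(a, b) { return a * b; }, 1);
      var data = [];
      for (var i = 0; i < size; i++) {
        data.push(draw());
      }
      store[options.name] = {dims: options.dims, data: data};
    }
  }
  // mu and sigma only matter at creation time; later calls just retrieve.
  return store[options.name];
}

var first = getParam({name: 'myparam', mu: 2, sigma: 0});  // created as exactly 2
var second = getParam({name: 'myparam', mu: 5, sigma: 0}); // retrieved, still 2
var weights = getParam({name: 'weights', dims: [10, 10]}); // 100 initial draws
```

In this sketch the initialization options are ignored once a parameter exists, which is one reading of "retrieves the value of a parameter by name".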


modelParam([options])
An analog of param used to create or retrieve a parameter that can be used directly in the model.
Optimizing the ELBO yields maximum likelihood estimation for model parameters. modelParam cannot be used with other inference strategies as it does not have an interpretation in the fully Bayesian setting. Attempting to do so will raise an exception.
modelParam supports the same options as param. See the documentation for param for details.