This will be wrapped in a `make_template` to ensure the variables are only
created once. It takes the input and returns the `loc` ('mu' in [Germain et
al. (2015)][1]) and `log_scale` ('alpha' in [Germain et al. (2015)][1]) from
the MADE network.
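A minimal usage sketch, assuming an older TensorFlow Probability release in
which this deprecated function is still available and a TF1-style graph-mode
setup; the 3-dimensional base distribution and the layer widths are arbitrary
choices, not requirements:

```python
import tensorflow.compat.v1 as tf
import tensorflow_probability as tfp

tf.disable_v2_behavior()
tfd = tfp.distributions
tfb = tfp.bijectors

# The template serves as `shift_and_log_scale_fn`: it maps the flow's input to
# (shift, log_scale), and its variables are created once and reused thereafter.
maf = tfd.TransformedDistribution(
    distribution=tfd.MultivariateNormalDiag(loc=tf.zeros([3])),
    bijector=tfb.MaskedAutoregressiveFlow(
        shift_and_log_scale_fn=tfb.masked_autoregressive_default_template(
            hidden_layers=[512, 512])))

samples = maf.sample(5)            # Builds the MADE variables on first use.
log_probs = maf.log_prob(samples)  # Reuses the same variables.
```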
#### About Hidden Layers

Each element of `hidden_layers` should be greater than the `input_depth`
(i.e., `input_depth = tf.shape(input)[-1]` where `input` is the input to the
neural network). This is necessary to ensure the autoregressivity property.
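A quick numeric illustration of that constraint (the input depth of 5 below is
an arbitrary assumption):

```python
# Suppose the network input has depth 5, i.e. input_depth = tf.shape(input)[-1] == 5.
hidden_layers = [512, 512]   # OK: every hidden width (512) exceeds input_depth (5).
# hidden_layers = [4, 512]   # Not OK: 4 <= 5, violating the stated requirement.
```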
#### About Clipping

This function also optionally clips the `log_scale` (but possibly not its
gradient). This is useful because if `log_scale` is too small/large it might
underflow/overflow making it impossible for the `MaskedAutoregressiveFlow`
bijector to implement a bijection. Additionally, the `log_scale_clip_gradient`
`bool` indicates whether the gradient should also be clipped. The default does
not clip the gradient; this is useful because it still provides gradient
information (for fitting) yet solves the numerical stability problem. I.e.,
`log_scale_clip_gradient = False` means
`grad[exp(clip(x))] = grad[x] exp(clip(x))` rather than the usual
`grad[clip(x)] exp(clip(x))`.
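One common way to realize such a gradient-preserving clip is the
`stop_gradient` identity trick; the sketch below is illustrative (it mirrors
the default clip bounds) and is not necessarily the exact implementation used
by this function:

```python
import tensorflow as tf

def clip_preserve_gradient(log_scale, lo=-5.0, hi=3.0):
  """Forward pass returns clip(log_scale, lo, hi); backward pass acts as identity."""
  clipped = tf.clip_by_value(log_scale, lo, hi)
  # The expression's value equals `clipped`, but its derivative w.r.t.
  # `log_scale` is 1 everywhere, so grad[exp(clip(x))] = grad[x] * exp(clip(x)).
  return log_scale + tf.stop_gradient(clipped - log_scale)
```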
#### Args

| Name | Description |
|------|-------------|
| `hidden_layers` | Python `list`-like of non-negative integer scalars indicating the number of units in each hidden layer. Default: `[512, 512]`. |
| `shift_only` | Python `bool` indicating if only the `shift` term shall be computed. Default: `False`. |
| `activation` | Activation function (callable). Explicitly setting to `None` implies a linear activation. |
| `log_scale_min_clip` | `float`-like scalar `Tensor`, or a `Tensor` with the same shape as `log_scale`. The minimum value to clip by. Default: -5. |
| `log_scale_max_clip` | `float`-like scalar `Tensor`, or a `Tensor` with the same shape as `log_scale`. The maximum value to clip by. Default: 3. |
| `log_scale_clip_gradient` | Python `bool` indicating that the gradient of [`tf.clip_by_value`](https://www.tensorflow.org/api_docs/python/tf/clip_by_value) should be preserved. Default: `False`. |
| `name` | A name for ops managed by this function. Default: 'masked_autoregressive_default_template'. |
| `*args` | `tf.layers.dense` arguments. |
| `**kwargs` | `tf.layers.dense` keyword arguments (see the sketch after this table). |
#### Returns

| Name | Description |
|------|-------------|
| `shift` | `Float`-like `Tensor` of shift terms (the 'mu' in [Germain et al. (2015)][1]). |
| `log_scale` | `Float`-like `Tensor` of log(scale) terms (the 'alpha' in [Germain et al. (2015)][1]). |
#### Raises

| Type | Description |
|------|-------------|
| `NotImplementedError` | if rightmost dimension of `inputs` is unknown prior to graph execution. |
#### References

[1]: Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. MADE:
     Masked Autoencoder for Distribution Estimation. In _International
     Conference on Machine Learning_, 2015. https://arxiv.org/abs/1502.03509