Runs multiple Fisher scoring steps.
tfp.glm.fit(
    model_matrix,
    response,
    model,
    model_coefficients_start=None,
    predicted_linear_response_start=None,
    l2_regularizer=None,
    dispersion=None,
    offset=None,
    convergence_criteria_fn=None,
    learning_rate=None,
    fast_unsafe_numerics=True,
    maximum_iterations=None,
    l2_regularization_penalty_factor=None,
    name=None
)
| Args | Description |
|---|---|
| `model_matrix` | (Batch of) float-like, matrix-shaped `Tensor` where each row represents a sample's features. |
| `response` | (Batch of) vector-shaped `Tensor` where each element represents a sample's observed response (to the corresponding row of features). Must have the same `dtype` as `model_matrix`. |
| `model` | `tfp.glm.ExponentialFamily`-like instance which implicitly characterizes a negative log-likelihood loss by specifying the distribution's `mean`, `gradient_mean`, and `variance`. |
| `model_coefficients_start` | Optional (batch of) vector-shaped `Tensor` representing the initial model coefficients, one for each column in `model_matrix`. Must have the same `dtype` as `model_matrix`. Default value: zeros. |
| `predicted_linear_response_start` | Optional `Tensor` with `shape`, `dtype` matching `response`; represents `offset`-shifted initial linear predictions based on `model_coefficients_start`. Default value: `offset` if `model_coefficients_start is None`, and `tf.linalg.matvec(model_matrix, model_coefficients_start) + offset` otherwise. |
| `l2_regularizer` | Optional scalar `Tensor` representing the L2 regularization penalty, i.e., `loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w||_2^2`. Default value: `None` (i.e., no L2 regularization). |
| `dispersion` | Optional (batch of) `Tensor` representing `response` dispersion, i.e., as in `p(y|theta) := exp((y theta - A(theta)) / dispersion)`. Must broadcast with rows of `model_matrix`. Default value: `None` (i.e., "no dispersion"). |
| `offset` | Optional `Tensor` representing a constant shift applied to `predicted_linear_response`. Must broadcast to `response`. Default value: `None` (i.e., `tf.zeros_like(response)`). |
| `convergence_criteria_fn` | Python `callable` taking `is_converged_previous`, `iter_`, `model_coefficients_previous`, `predicted_linear_response_previous`, `model_coefficients_next`, `predicted_linear_response_next`, `response`, `model`, `dispersion` and returning a `bool` `Tensor` indicating that Fisher scoring has converged. See `convergence_criteria_small_relative_norm_weights_change` for an example function. Default value: `None` (i.e., `convergence_criteria_small_relative_norm_weights_change`). |
| `learning_rate` | Optional (batch of) scalar `Tensor` used to dampen iterative progress. Typically needed only if optimization diverges; it should be no larger than `1` and is typically very close to `1`. Default value: `None` (i.e., `1`). |
| `fast_unsafe_numerics` | Optional Python `bool` indicating whether faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: `True` (i.e., "fast but possibly diminished accuracy"). |
| `maximum_iterations` | Optional maximum number of iterations of Fisher scoring to run; "and-ed" with the result of `convergence_criteria_fn`. Default value: `None` (i.e., `infinity`). |
| `l2_regularization_penalty_factor` | Optional (batch of) vector-shaped `Tensor` representing a separate penalty factor to apply to each model coefficient, with length equal to the number of columns in `model_matrix`. Each penalty factor multiplies `l2_regularizer` to allow differential regularization. A factor can be 0 for some coefficients, which implies no regularization of those coefficients. The default is 1 for all coefficients. `loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w * l2_regularization_penalty_factor||_2^2`. Default value: `None` (i.e., no per-coefficient regularization). |
| `name` | Python `str` used as name prefix to ops created by this function. Default value: `"fit"`. |
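The two regularization arguments combine as in the loss formula given for `l2_regularization_penalty_factor`. The following NumPy sketch evaluates that penalty term for a fixed negative log-likelihood value. The helper name `penalized_loss` and the numbers are hypothetical and chosen only for illustration; this is not the library's implementation.

```python
import numpy as np


def penalized_loss(neg_log_likelihood, w, l2_regularizer=0.0,
                   penalty_factor=None):
    """Hypothetical helper: NLL + l2_regularizer * ||w * penalty_factor||_2^2."""
    w = np.asarray(w, dtype=float)
    if penalty_factor is None:
        # Mirrors the documented default: a factor of 1 for every coefficient.
        penalty_factor = np.ones_like(w)
    scaled = w * np.asarray(penalty_factor, dtype=float)
    return neg_log_likelihood + l2_regularizer * np.sum(scaled ** 2)


w = np.array([3.0, -1.0, 2.0])       # pretend the first entry is an intercept
factor = np.array([0.0, 1.0, 1.0])   # a factor of 0 exempts the intercept

uniform = penalized_loss(10.0, w, l2_regularizer=0.5)
selective = penalized_loss(10.0, w, l2_regularizer=0.5, penalty_factor=factor)
print(uniform, selective)  # 17.0 12.5
```

Setting a factor to 0 for an intercept column is a common reason to use per-coefficient factors, since shrinking the intercept is rarely desired.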
| Returns | Description |
|---|---|
| `model_coefficients` | (Batch of) vector-shaped `Tensor`; represents the fitted model coefficients, one for each column in `model_matrix`. |
| `predicted_linear_response` | `response`-shaped `Tensor` representing linear predictions based on the new `model_coefficients`, i.e., `tf.linalg.matvec(model_matrix, model_coefficients) + offset`. |
| `is_converged` | `bool` `Tensor` indicating that the returned `model_coefficients` met the `convergence_criteria_fn` criteria within the `maximum_iterations` limit. |
| `iter_` | `int32` `Tensor` indicating the number of iterations taken. |
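The `is_converged` value depends on the convergence criterion in use. As a rough illustration of what a "small relative norm of weights change" test might look like, here is a NumPy sketch. It assumes the criterion compares `||w0 - w1||_2 / (1 + ||w0||_2)` against a tolerance, which is the same form as the error metric printed in the example on this page. The function name `small_relative_norm_change` and the tolerance `1e-5` are assumptions for illustration and may differ from the library's actual behavior.

```python
import numpy as np


def small_relative_norm_change(w_previous, w_next, tolerance=1e-5):
    """Hypothetical check: True if every batch member changed very little."""
    w_previous = np.asarray(w_previous, dtype=float)
    w_next = np.asarray(w_next, dtype=float)
    # Norms are taken over the coefficient axis so batches are supported.
    change = np.linalg.norm(w_next - w_previous, ord=2, axis=-1)
    scale = 1.0 + np.linalg.norm(w_previous, ord=2, axis=-1)
    return bool(np.all(change / scale < tolerance))


# A tiny update counts as converged; a large one does not.
converged = small_relative_norm_change([1.0, 2.0], [1.0, 2.0 + 1e-7])
not_converged = small_relative_norm_change([0.0, 0.0], [0.1, 0.0])
```

The `1 +` in the denominator keeps the ratio well defined when the previous coefficients are all zero, which is the documented default starting point.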
Example
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
def make_dataset(n, d, link, scale=1., dtype=np.float32):
  model_coefficients = tfd.Uniform(
      low=np.array(-1, dtype),
      high=np.array(1, dtype)).sample(d, seed=42)
  radius = np.sqrt(2.)
  model_coefficients *= radius / tf.linalg.norm(model_coefficients)
  model_matrix = tfd.Normal(
      loc=np.array(0, dtype),
      scale=np.array(1, dtype)).sample([n, d], seed=43)
  scale = tf.convert_to_tensor(scale, dtype)
  linear_response = tf.tensordot(
      model_matrix, model_coefficients, axes=[[1], [0]])
  if link == 'linear':
    response = tfd.Normal(loc=linear_response, scale=scale).sample(seed=44)
  elif link == 'probit':
    response = tf.cast(
        tfd.Normal(loc=linear_response, scale=scale).sample(seed=44) > 0,
        dtype)
  elif link == 'logit':
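    # Cast so that `response` has the same dtype as `model_matrix`, as
    # `tfp.glm.fit` requires; `Bernoulli.sample` returns `int32` by default.
    response = tf.cast(
        tfd.Bernoulli(logits=linear_response).sample(seed=44), dtype)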
  else:
    raise ValueError('unrecognized true link: {}'.format(link))
  return model_matrix, response, model_coefficients
# Simulate one million samples with 100 features under a probit link, then fit.
X, Y, w_true = make_dataset(n=int(1e6), d=100, link='probit')
w, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=X,
    response=Y,
    model=tfp.glm.BernoulliNormalCDF())
log_likelihood = tfp.glm.BernoulliNormalCDF().log_prob(Y, linear_response)
print('is_converged: ', is_converged.numpy())
print('    num_iter: ', num_iter.numpy())
print('    accuracy: ', np.mean((linear_response > 0.) == tf.cast(Y, bool)))
print('    deviance: ', 2. * np.mean(log_likelihood))
print('||w0-w1||_2 / (1+||w0||_2): ', (np.linalg.norm(w_true - w, ord=2) /
                                       (1. + np.linalg.norm(w_true, ord=2))))
# ==>
# is_converged:  True
#     num_iter:  6
#     accuracy:  0.804382
#     deviance:  -0.820746600628
# ||w0-w1||_2 / (1+||w0||_2):  0.00619245105309
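To make the idea of a Fisher scoring step more concrete, the following NumPy sketch writes one step as a weighted least-squares solve and iterates it. It is a simplified illustration under stated assumptions, not the library's implementation: it handles no batching, offset, dispersion, regularization, or learning rate, and the names `bernoulli_logit`, `fisher_scoring_step`, and `fit_sketch` are hypothetical. The model is described by its mean, variance, and gradient of the mean, which is the same information the `model` argument is documented to provide.

```python
import numpy as np


def bernoulli_logit(eta):
    """Returns (mean, variance, grad_mean) for a Bernoulli model, logit link."""
    mean = 1.0 / (1.0 + np.exp(-eta))
    variance = mean * (1.0 - mean)
    grad_mean = variance  # d sigmoid(eta) / d eta = mean * (1 - mean)
    return mean, variance, grad_mean


def fisher_scoring_step(X, y, w, model_fn):
    """One Fisher scoring update, posed as weighted least squares."""
    eta = X @ w
    mean, variance, grad_mean = model_fn(eta)
    weights = grad_mean ** 2 / variance   # per-sample Fisher information
    z = eta + (y - mean) / grad_mean      # working response
    lhs = X.T @ (weights[:, None] * X)
    rhs = X.T @ (weights * z)
    return np.linalg.solve(lhs, rhs)


def fit_sketch(X, y, model_fn, tolerance=1e-8, maximum_iterations=50):
    """Repeats steps until the relative coefficient change is small."""
    w = np.zeros(X.shape[1])
    for num_iter in range(1, maximum_iterations + 1):
        w_next = fisher_scoring_step(X, y, w, model_fn)
        rel = np.linalg.norm(w_next - w) / (1.0 + np.linalg.norm(w))
        w = w_next
        if rel < tolerance:
            return w, True, num_iter
    return w, False, maximum_iterations


# Small synthetic logistic-regression problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
w_true = np.array([0.5, -1.0, 0.25])
p = 1.0 / (1.0 + np.exp(-(X @ w_true)))
y = (rng.uniform(size=2000) < p).astype(float)

w_hat, is_converged, num_iter = fit_sketch(X, y, bernoulli_logit)
print('is_converged:', is_converged, 'num_iter:', num_iter)
```

For the logit link, the weights reduce to `mean * (1 - mean)`, so each step coincides with a Newton step on the log-likelihood. Other links only require a different `model_fn`.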