Currently, only linear SVMs are supported. The underlying optimization
problem is solved with SDCAOptimizer. For performance and convergence tuning,
the num_loss_partitions parameter passed to SDCAOptimizer (see the __init__()
method) should be set to (#concurrent train ops per worker) x (#workers). If
num_loss_partitions is larger than or equal to this value, convergence is
guaranteed but becomes slower as num_loss_partitions increases. If it is set
to a smaller value, the optimizer is more aggressive in reducing the global
loss, but convergence is not guaranteed. The recommended value in an
Estimator (where there is one process per worker) is the number of workers
running the train steps. It defaults to 1 (single machine).
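The sizing rule above amounts to a single multiplication. A minimal sketch in plain Python follows; the helper name recommended_num_loss_partitions and its arguments are hypothetical and not part of the library.

```python
def recommended_num_loss_partitions(num_workers, concurrent_train_ops_per_worker=1):
    """Return (#concurrent train ops per worker) x (#workers).

    Hypothetical helper for illustration; not part of the library.
    """
    if num_workers < 1 or concurrent_train_ops_per_worker < 1:
        raise ValueError("both counts must be at least 1")
    return num_workers * concurrent_train_ops_per_worker


# Single machine with one train op: matches the documented default of 1.
single_machine = recommended_num_loss_partitions(num_workers=1)

# An Estimator with 4 workers and one process per worker.
four_workers = recommended_num_loss_partitions(num_workers=4)
```

Passing a value below this product trades the convergence guarantee for faster reduction of the global loss, as described above.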
The input of fit and evaluate must contain the following features; otherwise
a KeyError is raised:
- a feature with key=example_id_column whose value is a Tensor of dtype
  string.
- if weight_column_name is not None, a feature with
  key=weight_column_name whose value is a Tensor.
- for each column in feature_columns:
  - if `column` is a `SparseColumn`, a feature with `key=column.name`
    whose `value` is a `SparseTensor`.
  - if `column` is a `RealValuedColumn`, a feature with `key=column.name`
    whose `value` is a `Tensor`.
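The key requirements listed above can be checked before calling fit or evaluate. The sketch below is plain Python with a hypothetical helper, missing_feature_keys; plain lists stand in for Tensor and SparseTensor values, and only the presence of keys is checked, not the value types.

```python
def missing_feature_keys(features, example_id_column, feature_column_names,
                         weight_column_name=None):
    """List the required keys that are absent from a features dict.

    Hypothetical helper for illustration; not part of the library.
    """
    required = [example_id_column]
    if weight_column_name is not None:
        required.append(weight_column_name)
    required.extend(feature_column_names)
    return [key for key in required if key not in features]


# Plain lists stand in for Tensors and SparseTensors here.
features = {
    "example_id": ["1", "2", "3"],
    "real_feature": [[1.0], [0.0], [3.0]],
}
missing = missing_feature_keys(
    features,
    example_id_column="example_id",
    feature_column_names=["real_feature", "sparse_feature"],
    weight_column_name="weights",
)
print(missing)  # ['weights', 'sparse_feature']
```

An empty result means every documented key is present; any key it reports would lead to the KeyError mentioned above.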
Args
example_id_column
A string defining the feature column name representing
example ids. Used to initialize the underlying optimizer.
feature_columns
An iterable containing all the feature columns used by
the model. All items in the iterable should be instances of classes derived
from FeatureColumn.
weight_column_name
A string defining the feature column name representing
weights. It is used to down-weight or boost examples during training. It
will be multiplied by the loss of the example.
model_dir
Directory to save model parameters, graph, etc. This can
also be used to load checkpoints from the directory into an estimator to
continue training a previously saved model.
l1_regularization
L1-regularization parameter. Refers to global L1
regularization (across all examples).
l2_regularization
L2-regularization parameter. Refers to global L2
regularization (across all examples).
num_loss_partitions
Number of partitions of the (global) loss function
optimized by the underlying optimizer (SDCAOptimizer).
kernels
A list of kernels for the SVM. Currently, no kernels are
supported. Reserved for future use for non-linear SVMs.
config
RunConfig object to configure the runtime settings.
feature_engineering_fn
Feature engineering function. Takes features and
labels which are the output of input_fn and
returns features and labels which will be fed
into the model.
Raises
ValueError
If kernels is not None.
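The effect of weight_column_name described in the arguments above can be illustrated with a per-example weighted hinge loss. This is a minimal sketch that assumes the textbook hinge loss max(0, 1 - y * score) with labels in {-1, +1}; the function name is hypothetical, and the optimizer's internal formulation may differ.

```python
def weighted_hinge_loss(label, score, weight=1.0):
    """Hinge loss for one example, multiplied by its weight.

    Assumes label is -1 or +1 and score is the raw linear model output.
    Hypothetical helper for illustration; not part of the library.
    """
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    return weight * max(0.0, 1.0 - label * score)


# The same example inside the margin, unweighted and boosted by 2x.
unweighted = weighted_hinge_loss(label=1, score=0.5)
boosted = weighted_hinge_loss(label=1, score=0.5, weight=2.0)

# A confidently correct example contributes no loss at any weight.
no_loss = weighted_hinge_loss(label=-1, score=-2.0, weight=3.0)
```

A weight above 1 boosts an example's contribution to the loss, and a weight below 1 down-weights it.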
Attributes
config
model_dir
Returns a path in which the eval process will look for checkpoints.
model_fn
Returns the model_fn which is bound to self.params.
Exports the inference graph as a SavedModel into the given directory.
Args
export_dir_base
A string containing a directory to write the exported
graph and checkpoints.
serving_input_fn
A function that takes no arguments and
returns an InputFnOps.
default_output_alternative_key
The name of the head to serve when none is
specified. Not needed for single-headed models.
assets_extra
A dict specifying how to populate the assets.extra directory
within the exported SavedModel. Each key should give the destination
path (including the filename) relative to the assets.extra directory.
The corresponding value gives the full path of the source file to be
copied. For example, the simple case of copying a single file without
renaming it is specified as
{'my_asset_file.txt': '/path/to/my_asset_file.txt'}.
as_text
Whether to write the SavedModel proto in text format.
checkpoint_path
The checkpoint path to export. If None (the default),
the most recent checkpoint found within the model directory is chosen.
graph_rewrite_specs
An iterable of GraphRewriteSpec. Each element will
produce a separate MetaGraphDef within the exported SavedModel, tagged
and rewritten as specified. Defaults to a single entry using the
default serving tag ("serve") and no rewriting.
strip_default_attrs
Boolean. If True, default-valued attributes will be
removed from the NodeDefs. For a detailed guide, see
Stripping Default-Valued Attributes.
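The assets_extra mapping described above pairs a destination path, relative to the assets.extra directory, with the full path of a source file. The sketch below imitates that copy step with the standard library only; copy_assets_extra is a hypothetical helper, the directories are temporary stand-ins, and this is not the library's implementation.

```python
import os
import shutil
import tempfile


def copy_assets_extra(assets_extra, assets_extra_dir):
    """Copy each source file to its destination under assets_extra_dir.

    Keys are destination paths relative to assets_extra_dir; values are
    full source paths. Hypothetical helper for illustration only.
    """
    for dest_relative, source in assets_extra.items():
        dest = os.path.join(assets_extra_dir, dest_relative)
        # Create intermediate directories for nested destinations.
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy(source, dest)


# Demonstration with a temporary directory standing in for the export dir.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "my_asset_file.txt")
with open(source_path, "w") as f:
    f.write("vocabulary\n")

target_dir = os.path.join(work_dir, "assets.extra")
copy_assets_extra({"my_asset_file.txt": source_path}, target_dir)
copied = sorted(os.listdir(target_dir))
```

Using a key such as 'sub/renamed.txt' would copy the same source file under a new name in a subdirectory.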
Incremental fit on a batch of samples. (deprecated arguments)
This method is expected to be called several times consecutively
on different or the same chunks of the dataset. This can implement
either iterative training or out-of-core/online training.
This is especially useful when the whole dataset is too big to
fit in memory at the same time, or when the model is taking a long time
to converge and you want to split up training into subparts.
Args
x
Matrix of shape [n_samples, n_features...]. Can be an iterator that
returns arrays of features. The training input samples for fitting the
model. If set, input_fn must be None.
y
Vector or matrix [n_samples] or [n_samples, n_outputs]. Can be an
iterator that returns arrays of labels. The training label values
(class labels in classification, real numbers in regression). If set,
input_fn must be None.
input_fn
Input function. If set, x, y, and batch_size must be
None.
steps
Number of steps for which to train the model. If None, train forever.
batch_size
Minibatch size to use on the input; defaults to the first
dimension of x. Must be None if input_fn is provided.
monitors
List of BaseMonitor subclass instances. Used for callbacks
inside the training loop.
Returns
self, for chaining.
Raises
ValueError
If at least one of x and y is provided, and input_fn is
provided.
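The argument rules above (x and y on one side, input_fn on the other) can be expressed as a small check. This is a sketch in plain Python with a hypothetical function name; raising ValueError for the batch_size conflict is an assumption, since the text only documents ValueError for the x/y case.

```python
def check_fit_inputs(x=None, y=None, input_fn=None, batch_size=None):
    """Validate the mutually exclusive input arguments described above.

    Hypothetical helper for illustration; not part of the library.
    """
    if input_fn is not None:
        if x is not None or y is not None:
            raise ValueError("x and y must be None when input_fn is provided")
        # Assumption: the source says batch_size must be None here but does
        # not state which error is raised.
        if batch_size is not None:
            raise ValueError("batch_size must be None when input_fn is provided")


def dummy_input_fn():
    """Stand-in for a real input function; returns (features, labels)."""
    return {"example_id": ["1"]}, [1]


# Valid: in-memory data only, or an input function only.
check_fit_inputs(x=[[1.0], [2.0]], y=[0, 1], batch_size=2)
check_fit_inputs(input_fn=dummy_input_fn)

# Invalid: mixing in-memory data with an input function.
try:
    check_fit_inputs(x=[[1.0]], input_fn=dummy_input_fn)
    mixed_inputs_rejected = False
except ValueError:
    mixed_inputs_rejected = True
```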
Returns predictions for given features. (deprecated arguments)
Args
x
Matrix of shape [n_samples, n_features...]. Can be an iterator that
returns arrays of features. The input samples for which to compute
predictions. If set, input_fn must be None.
input_fn
Input function. If set, x and batch_size must be None.
batch_size
Override default batch size. If set, input_fn must be
None.
outputs
List of str, names of the outputs to predict.
If None, returns all.
as_iterable
If True, return an iterable which keeps yielding predictions
for each example until inputs are exhausted. Note: The inputs must
terminate if you want the iterable to terminate (e.g. be sure to pass
num_epochs=1 if you are using something like read_batch_features).
iterate_batches
If True, yield the whole batch at once instead of
decomposing the batch into individual samples. Only relevant when
as_iterable is True.
Returns
A numpy array of predicted classes or regression values if the
constructor's model_fn returns a Tensor for predictions or a dict
of numpy arrays if model_fn returns a dict. Returns an iterable of
predictions if as_iterable is True.
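The difference between per-example and per-batch iteration described for as_iterable and iterate_batches can be sketched with a generator. Plain lists stand in for numpy arrays, and iterate_predictions is a hypothetical name, not the library's implementation.

```python
def iterate_predictions(batches, iterate_batches=False):
    """Yield predictions one example at a time, or one batch at a time.

    `batches` is any iterable of per-batch prediction sequences.
    Hypothetical helper for illustration; not part of the library.
    """
    for batch in batches:
        if iterate_batches:
            # Yield the whole batch at once.
            yield batch
        else:
            # Decompose the batch into individual samples.
            for prediction in batch:
                yield prediction


batches = [[0, 1], [1, 1], [0]]
per_example = list(iterate_predictions(batches))
per_batch = list(iterate_predictions(batches, iterate_batches=True))
print(per_example)  # [0, 1, 1, 1, 0]
print(per_batch)    # [[0, 1], [1, 1], [0]]
```

As the note on as_iterable says, the generator only terminates if the underlying input does; an endless stream of batches would yield predictions forever.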
The method works on simple estimators as well as on nested objects
(such as pipelines). The former have parameters of the form
<component>__<parameter> so that it's possible to update each
component of a nested object.
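The <component>__<parameter> naming scheme can be illustrated by splitting parameter names on the first double underscore. This is a sketch in plain Python; split_nested_params and the component name "svm" are hypothetical, and the method's real implementation may differ.

```python
def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> entries.

    Returns (own, nested), where nested maps each component name to a dict
    of its parameters. Hypothetical helper for illustration only.
    """
    own, nested = {}, {}
    for key, value in params.items():
        component, sep, parameter = key.partition("__")
        if sep:
            nested.setdefault(component, {})[parameter] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({
    "steps": 100,
    "svm__l2_regularization": 0.5,
    "svm__l1_regularization": 0.0,
})
print(own)     # {'steps': 100}
print(nested)  # {'svm': {'l2_regularization': 0.5, 'l1_regularization': 0.0}}
```

Because the split happens at the first double underscore, a deeper name such as a__b__c is handed to component a as the parameter b__c, which lets each level of a nested object resolve its own part.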