BoostedTreesSparseCalculateBestFeatureSplit

public final class BoostedTreesSparseCalculateBestFeatureSplit

Calculates gains for each feature and returns the best possible split information for the feature.

The split information is the best threshold (bucket id), gains and left/right node contributions per node for each feature.

It is possible that not all nodes can be split on each feature. Hence, the list of possible nodes can differ between the features. Therefore, we return `node_ids_list` for each feature, containing the list of nodes that this feature can be used to split.

In this manner, the output is the best split per feature and per node, so the per-feature results need to be combined later to produce the best split for each node (among all possible features).

The output shapes are compatible in that the first dimension of all tensors is the same and equal to the number of possible split nodes for each feature.
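The later combination step mentioned above can be sketched in plain Java. This is a minimal illustration, not part of the TensorFlow API: the class `BestSplitCombiner`, its `Candidate` type, and the array-based inputs are hypothetical stand-ins for the `nodeIds` and `gains` outputs collected from one op call per feature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: picks, for every node, the feature with the highest gain.
final class BestSplitCombiner {

    /** Winning candidate for one node. */
    static final class Candidate {
        final int featureIndex; // which feature produced the best split
        final int row;          // row into that feature's output tensors
        final float gain;

        Candidate(int featureIndex, int row, float gain) {
            this.featureIndex = featureIndex;
            this.row = row;
            this.gain = gain;
        }
    }

    /**
     * nodeIdsPerFeature[f] and gainsPerFeature[f] mirror the nodeIds and gains
     * outputs for feature f; both share the same first dimension.
     */
    static Map<Integer, Candidate> combine(int[][] nodeIdsPerFeature, float[][] gainsPerFeature) {
        Map<Integer, Candidate> best = new HashMap<>();
        for (int f = 0; f < nodeIdsPerFeature.length; f++) {
            int[] nodeIds = nodeIdsPerFeature[f];
            float[] gains = gainsPerFeature[f];
            if (nodeIds.length != gains.length) {
                throw new IllegalArgumentException("nodeIds and gains must have the same length");
            }
            for (int row = 0; row < nodeIds.length; row++) {
                Candidate current = best.get(nodeIds[row]);
                // Keep the first candidate seen, or replace it when the gain is strictly higher.
                if (current == null || gains[row] > current.gain) {
                    best.put(nodeIds[row], new Candidate(f, row, gains[row]));
                }
            }
        }
        return best;
    }
}
```

Because each feature may report a different set of node ids, the sketch keys the result by node id rather than assuming the rows line up across features.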

Nested Classes

class BoostedTreesSparseCalculateBestFeatureSplit.Options Optional attributes for BoostedTreesSparseCalculateBestFeatureSplit

Public Methods

static BoostedTreesSparseCalculateBestFeatureSplit
create ( Scope scope, Operand <Integer> nodeIdRange, Operand <Integer> statsSummaryIndices, Operand <Float> statsSummaryValues, Operand <Integer> statsSummaryShape, Operand <Float> l1, Operand <Float> l2, Operand <Float> treeComplexity, Operand <Float> minNodeWeight, Long logitsDimension, Options... options)
Factory method to create a class wrapping a new BoostedTreesSparseCalculateBestFeatureSplit operation.
Output <Integer>
featureDimensions ()
A Rank 1 tensor indicating the best feature dimension for each feature to split for each node.
Output <Float>
gains ()
A Rank 1 tensor indicating the best gains to split each node.
Output <Float>
leftNodeContribs ()
A Rank 2 tensor indicating the contribution of the left nodes when branching from parent nodes to the left direction by the given threshold for each feature.
Output <Integer>
nodeIds ()
A Rank 1 tensor indicating possible node ids that can be split.
Output <Float>
rightNodeContribs ()
A Rank 2 tensor with the same shape/conditions as left_node_contribs_list, but with values for the right node.
static BoostedTreesSparseCalculateBestFeatureSplit.Options
splitType (String splitType)
Output <String>
splitWithDefaultDirections ()
A Rank 1 tensor indicating which direction to go if data is missing.
Output <Integer>
thresholds ()
A Rank 1 tensor indicating the bucket id to compare with (as a threshold) for split in each node.

Public Methods

public static BoostedTreesSparseCalculateBestFeatureSplit create ( Scope scope, Operand <Integer> nodeIdRange, Operand <Integer> statsSummaryIndices, Operand <Float> statsSummaryValues, Operand <Integer> statsSummaryShape, Operand <Float> l1, Operand <Float> l2, Operand <Float> treeComplexity, Operand <Float> minNodeWeight, Long logitsDimension, Options... options)

Factory method to create a class wrapping a new BoostedTreesSparseCalculateBestFeatureSplit operation.

Parameters
scope current scope
nodeIdRange A Rank 1 tensor (shape=[2]) to specify the range [first, last) of node ids to process within `stats_summary_list`. The nodes are iterated over the range given by the tensor, as in `for node_id in range(node_id_range[0], node_id_range[1])` (note that the last index, node_id_range[1], is exclusive).
statsSummaryIndices A Rank 2 int64 tensor of dense shape [N, 4] (N specifies the number of non-zero values) for accumulated stats summary (gradient/hessian) per node per bucket for each feature. The second dimension contains node id, feature dimension, bucket id, and stats dim. Stats dim is the sum of the logits dimension and the hessian dimension; the hessian dimension is either the logits dimension if a diagonal hessian is used, or logits dimension^2 if a full hessian is used.
statsSummaryValues A Rank 1 float tensor of dense shape [N] (N specifies the number of non-zero values), which supplies the values for each element in summary_indices.
statsSummaryShape A Rank 1 integer tensor of dense shape [4] that specifies the dense shape of the sparse tensor: [num tree nodes, feature dimensions, num buckets, stats dim].
l1 l1 regularization factor on leaf weights, per instance based.
l2 l2 regularization factor on leaf weights, per instance based.
treeComplexity adjustment to the gain, per leaf based.
minNodeWeight minimum average of hessians in a node required for the node to be considered for splitting.
logitsDimension The dimension of logit, i.e., number of classes.
options carries optional attribute values
Returns
  • a new instance of BoostedTreesSparseCalculateBestFeatureSplit
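
The parameter descriptions above imply a particular sparse layout. The following plain-Java sketch illustrates it under stated assumptions: `StatsSummaryLayout` and its methods are hypothetical names introduced only for illustration, and Java arrays stand in for the tensors passed to `create`.

```java
// Hypothetical helper illustrating the stats summary layout described above.
final class StatsSummaryLayout {

    /** stats dim = logits dimension + hessian dimension. */
    static int statsDim(int logitsDimension, boolean fullHessian) {
        int hessianDim = fullHessian
                ? logitsDimension * logitsDimension // full hessian
                : logitsDimension;                  // diagonal hessian
        return logitsDimension + hessianDim;
    }

    /** Expands the half-open range [first, last) into explicit node ids. */
    static int[] nodeIdsInRange(int[] nodeIdRange) {
        int first = nodeIdRange[0];
        int last = nodeIdRange[1]; // exclusive
        int[] ids = new int[Math.max(0, last - first)];
        for (int nodeId = first; nodeId < last; nodeId++) {
            ids[nodeId - first] = nodeId;
        }
        return ids;
    }

    /**
     * Scatters N sparse entries into a dense array of shape
     * [num tree nodes, feature dimensions, num buckets, stats dim].
     * Each row of indices is {node id, feature dimension, bucket id, stats dim}.
     */
    static float[][][][] densify(int[][] indices, float[] values, int[] denseShape) {
        float[][][][] dense =
                new float[denseShape[0]][denseShape[1]][denseShape[2]][denseShape[3]];
        for (int n = 0; n < values.length; n++) {
            int[] idx = indices[n];
            dense[idx[0]][idx[1]][idx[2]][idx[3]] = values[n];
        }
        return dense;
    }
}
```

For example, with a logits dimension of 3, `statsDim` gives 6 for a diagonal hessian and 12 for a full hessian, which is the size of the last entry in `statsSummaryShape`.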

public Output <Integer> featureDimensions ()

A Rank 1 tensor indicating the best feature dimension for each feature to split for each node.

public Output <Float> gains ()

A Rank 1 tensor indicating the best gains to split each node.

public Output <Float> leftNodeContribs ()

A Rank 2 tensor indicating the contribution of the left nodes when branching from parent nodes to the left direction by the given threshold for each feature. This value will be used to make the left node value by adding to the parent node value. Second dimension size is logits dimension.
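
As a small illustration of how a contribution row is applied, the sketch below adds one row of the contributions to a parent node value. `NodeContribution` and `childValue` are hypothetical names, and plain float arrays of length logits dimension stand in for tensor rows.

```java
// Hypothetical helper: child node value = parent node value + contribution.
final class NodeContribution {

    static float[] childValue(float[] parentValue, float[] contrib) {
        if (parentValue.length != contrib.length) {
            throw new IllegalArgumentException("both arrays must have length logits dimension");
        }
        float[] child = new float[parentValue.length];
        for (int i = 0; i < parentValue.length; i++) {
            child[i] = parentValue[i] + contrib[i];
        }
        return child;
    }
}
```

The same helper applies to a row of the right node contributions, since those share the shape of the left ones.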

public Output <Integer> nodeIds ()

A Rank 1 tensor indicating possible node ids that can be split.

public Output <Float> rightNodeContribs ()

A Rank 2 tensor with the same shape/conditions as left_node_contribs_list, but with values for the right node.

public static BoostedTreesSparseCalculateBestFeatureSplit.Options splitType (String splitType)

Parameters
splitType A string indicating if this Op should perform inequality split or equality split.

public Output <String> splitWithDefaultDirections ()

A Rank 1 tensor indicating which direction to go if data is missing. Inequality with default left returns 0, inequality with default right returns 1, equality with default right returns 2.

public Output <Integer> thresholds ()

A Rank 1 tensor indicating the bucket id to compare with (as a threshold) for split in each node.
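
To show how a threshold and a default direction might be used together, here is a hedged routing sketch. Several things in it are assumptions rather than documented behavior: the class `SplitRouting` is hypothetical, the direction codes 0, 1 and 2 are taken as plain integers from the description of `splitWithDefaultDirections` (how they are encoded in the String output is not specified here), and the comparison rules (a bucket id less than or equal to the threshold goes left for inequality splits, an equal bucket id goes left for equality splits) are assumed for illustration.

```java
// Hypothetical helper: decides whether an example goes to the left child.
final class SplitRouting {
    static final int INEQUALITY_DEFAULT_LEFT = 0;
    static final int INEQUALITY_DEFAULT_RIGHT = 1;
    static final int EQUALITY_DEFAULT_RIGHT = 2;

    /**
     * bucketId is null when the feature value is missing.
     * ASSUMPTION: "<= threshold goes left" and "== threshold goes left"
     * are illustrative conventions, not confirmed by this reference page.
     */
    static boolean goesLeft(Integer bucketId, int threshold, int directionCode) {
        switch (directionCode) {
            case INEQUALITY_DEFAULT_LEFT:
                return bucketId == null || bucketId <= threshold;
            case INEQUALITY_DEFAULT_RIGHT:
                return bucketId != null && bucketId <= threshold;
            case EQUALITY_DEFAULT_RIGHT:
                return bucketId != null && bucketId == threshold;
            default:
                throw new IllegalArgumentException("unknown direction code: " + directionCode);
        }
    }
}
```

Only the missing-value behavior differs between codes 0 and 1; code 2 switches the comparison itself to equality.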