Whether y_pred is expected to be a logits tensor. By
default, y_pred is assumed to encode a probability distribution.
label_smoothing
Float in [0, 1]. When > 0, the labels are smoothed, which
relaxes the confidence placed on the target class. For example,
a value of 0.1 uses 0.1 / num_classes for non-target labels
and 0.9 + 0.1 / num_classes for target labels.
axis
Defaults to -1. The dimension (the class axis) along which the
cross-entropy is computed.
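
The arguments above can be illustrated with a minimal pure-Python sketch for a single sample. This is not the library's implementation: the helper names (smooth_labels, softmax, crossentropy_sketch) and the clipping constant eps are assumptions made for illustration, and the list index stands in for the class axis.

```python
import math


def smooth_labels(y_true, label_smoothing):
    """Blend one-hot labels with a uniform distribution over the classes."""
    num_classes = len(y_true)
    return [
        y * (1.0 - label_smoothing) + label_smoothing / num_classes
        for y in y_true
    ]


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def crossentropy_sketch(y_true, y_pred, from_logits=False, label_smoothing=0.0):
    """Cross-entropy for one sample; hypothetical helper, not the real API."""
    if label_smoothing > 0:
        y_true = smooth_labels(y_true, label_smoothing)
    # With from_logits=True the scores are normalized first; otherwise
    # y_pred is taken to be a probability distribution already.
    probs = softmax(y_pred) if from_logits else y_pred
    eps = 1e-7  # assumed clipping constant to avoid log(0)
    return -sum(
        t * math.log(min(max(p, eps), 1.0 - eps))
        for t, p in zip(y_true, probs)
    )


# Label smoothing of 0.1 with three classes.
smoothed = smooth_labels([0.0, 0.0, 1.0], 0.1)

# The same prediction expressed as logits and as probabilities.
loss_from_logits = crossentropy_sketch(
    [0.0, 0.0, 1.0], [1.0, 2.0, 3.0], from_logits=True
)
loss_from_probs = crossentropy_sketch([0.0, 0.0, 1.0], softmax([1.0, 2.0, 3.0]))
```

In this sketch the non-target entries of smoothed equal 0.1 / 3 and the target entry equals 0.9 + 0.1 / 3, matching the description of label_smoothing. The two losses agree because passing logits with from_logits=True is equivalent to applying softmax beforehand.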