
Scaled Exponential Linear Unit (SELU).

The Scaled Exponential Linear Unit (SELU) activation function is: scale * x if x > 0 and scale * alpha * (exp(x) - 1) if x <= 0, where alpha and scale are pre-defined constants (alpha = 1.67326324 and scale = 1.05070098). SELU is the Exponential Linear Unit (ELU) multiplied by scale; because scale > 1, the slope is larger than one for positive net inputs.
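
The definition above can be sketched in plain Python for scalar inputs. This is only an illustration of the formula, not the TensorFlow implementation; the function names `elu` and `selu` and the constant names are chosen here for the example.

```python
import math

# Constants as quoted in the text above.
ALPHA = 1.67326324
SCALE = 1.05070098


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential Linear Unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def selu(x: float) -> float:
    """SELU written as scale * elu(x, alpha)."""
    return SCALE * elu(x, ALPHA)


if __name__ == "__main__":
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"selu({value:+.1f}) = {selu(value):+.6f}")
```

For positive inputs the slope equals scale, and for very negative inputs the output saturates at -scale * alpha.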

The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers as long as the weights are initialized correctly (see lecun_normal initialization) and the number of inputs is "large enough" (see references for more information).
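
A rough way to check this property numerically is a Monte Carlo estimate. The sketch below assumes the pre-activations of a layer are approximately standard normal, which is what correctly initialized weights and a large number of inputs are meant to provide, and then estimates the mean and variance of the SELU outputs. The helper name `selu_output_moments`, the sample count, and the seed are choices made for this example.

```python
import math
import random
from typing import Tuple

ALPHA = 1.67326324
SCALE = 1.05070098


def selu(x: float) -> float:
    """Scalar SELU, following the piecewise definition."""
    return SCALE * x if x > 0 else SCALE * ALPHA * math.expm1(x)


def selu_output_moments(n_samples: int = 200_000, seed: int = 0) -> Tuple[float, float]:
    """Estimate the mean and variance of selu(z) for z drawn from N(0, 1)."""
    rng = random.Random(seed)
    outputs = [selu(rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
    mean = sum(outputs) / n_samples
    variance = sum((y - mean) ** 2 for y in outputs) / n_samples
    return mean, variance


mean, variance = selu_output_moments()
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

Under that assumption the estimates should land close to a mean of 0 and a variance of 1, so the next layer again receives inputs with roughly the same statistics.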

(Courtesy: blog post on Towards Data Science.)

Example Usage:

import tensorflow as tf
from tensorflow.keras.layers import Dense

n_classes = 10  # 10-class problem
model = tf.keras.Sequential()
# Each hidden layer pairs the 'selu' activation with 'lecun_normal' weights.
model.add(Dense(64, kernel_initializer='lecun_normal',
                activation='selu', input_shape=(28, 28, 1)))
model.add(Dense(32, kernel_initializer='lecun_normal',
                activation='selu'))
model.add(Dense(16, kernel_initializer='lecun_normal',
                activation='selu'))
model.add(Dense(n_classes, activation='softmax'))

Arguments:

x: A tensor or variable to compute the activation function for.

Returns:

The scaled exponential unit activation: scale * elu(x, alpha).


Notes:

- To be used together with the "lecun_normal" initialization.
- To be used together with the "AlphaDropout" dropout variant.
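
To illustrate why a dedicated dropout variant is paired with SELU, here is a minimal pure-Python sketch of alpha dropout as formulated in the Self-Normalizing Neural Networks paper. It is not the code of the Keras AlphaDropout layer; the function name `alpha_dropout`, the rate of 0.1, and the assumption of standard-normal input activations are all choices made for this example. Dropped units are set to the SELU saturation value -scale * alpha instead of zero, and an affine correction restores zero mean and unit variance.

```python
import math
import random
from typing import List

ALPHA = 1.67326324
SCALE = 1.05070098
ALPHA_PRIME = -SCALE * ALPHA  # value SELU saturates to for very negative inputs


def alpha_dropout(values: List[float], rate: float, rng: random.Random) -> List[float]:
    """Set each value to ALPHA_PRIME with probability `rate`, then rescale."""
    keep = 1.0 - rate
    # Affine correction so zero-mean, unit-variance inputs keep those moments.
    a = 1.0 / math.sqrt(keep + ALPHA_PRIME ** 2 * keep * (1.0 - keep))
    b = -a * ALPHA_PRIME * (1.0 - keep)
    return [a * (v if rng.random() < keep else ALPHA_PRIME) + b for v in values]


rng = random.Random(0)
activations = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
dropped = alpha_dropout(activations, rate=0.1, rng=rng)

mean = sum(dropped) / len(dropped)
variance = sum((d - mean) ** 2 for d in dropped) / len(dropped)
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

Standard dropout would zero the dropped units and shift the activation statistics, whereas this variant is built to leave the mean and variance approximately unchanged.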


References:

- Self-Normalizing Neural Networks (Klambauer et al., 2017)