Activation function
In computational networks, the activation function of a node defines the output of that node given an input or set of inputs.
Types of activation functions
The following table lists activation functions implemented in stats++ that are functions of a single input \(x\) from the previous layer(s):
Name | Equation | Derivative | Range |
---|---|---|---|
Logistic | \(f(x)=\frac{1}{1+e^{-x}}\) | \(f'(x)=f(x)(1-f(x))\) | \((0,1)\) |
tanh | \(f(x)=\tanh(x)=\frac{2}{1+e^{-2x}}-1\) | \(f'(x)=1-f(x)^2\) | \((-1,1)\) |
tanh (skewed)[1] | \(f(x)=1.7159\tanh\!\left(\frac{2x}{3}\right)\) | \(f'(x)=\frac{2\times 1.7159}{3}\left(1-\tanh^2\!\left(\frac{2x}{3}\right)\right)\) | \((-1.7159,1.7159)\) |
Rectified linear unit (ReLU)[2] | \(f(x) = \left \{ \begin{array}{rcl} 0 & \mbox{for} & x < 0\\ x & \mbox{for} & x \ge 0\end{array} \right.\) | \(f'(x) = \left \{ \begin{array}{rcl} 0 & \mbox{for} & x < 0\\ 1 & \mbox{for} & x \ge 0\end{array} \right.\) | \([0,\infty)\) |
SoftPlus[3] | \(f(x)=\ln(1+e^x)\) | \(f'(x)=\frac{1}{1+e^{-x}}\) | \((0,\infty)\) |
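The scalar activations above map each pre-activation value independently, so they are simple to implement directly from the table. The following is a minimal, self-contained C++ sketch of these functions and their derivatives; the names (`logistic`, `tanh_skewed`, `relu`, `softplus`, and so on) are chosen here for illustration and are not the stats++ interface.

```cpp
#include <cmath>
#include <cstdio>

// Logistic: f(x) = 1/(1 + e^{-x});  f'(x) = f(x)(1 - f(x)).
double logistic(double x)       { return 1.0 / (1.0 + std::exp(-x)); }
double logistic_deriv(double x) { double f = logistic(x); return f * (1.0 - f); }

// tanh:  f'(x) = 1 - f(x)^2.
double tanh_deriv(double x) { double f = std::tanh(x); return 1.0 - f * f; }

// Skewed tanh (LeCun): f(x) = 1.7159 tanh(2x/3).
double tanh_skewed(double x) { return 1.7159 * std::tanh(2.0 * x / 3.0); }
double tanh_skewed_deriv(double x) {
    double t = std::tanh(2.0 * x / 3.0);
    return 1.7159 * (2.0 / 3.0) * (1.0 - t * t);
}

// ReLU: max(0, x); derivative is 0 for x < 0 and 1 for x >= 0.
double relu(double x)       { return x < 0.0 ? 0.0 : x; }
double relu_deriv(double x) { return x < 0.0 ? 0.0 : 1.0; }

// SoftPlus: ln(1 + e^x); its derivative is the logistic function.
double softplus(double x)       { return std::log1p(std::exp(x)); }
double softplus_deriv(double x) { return logistic(x); }

int main() {
    const double xs[] = {-2.0, 0.0, 2.0};
    for (double x : xs)
        std::printf("x=%+.1f  logistic=%.4f  relu=%.4f  softplus=%.4f\n",
                    x, logistic(x), relu(x), softplus(x));
}
```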
The following table lists activation functions implemented in stats++ that are not functions of a single input \(x\) from the previous layer(s):
Name | Equation | Derivatives | Range |
---|---|---|---|
Softmax | \(f(\mathbf{x})_i = \frac{e^{x_i}}{\sum_{k=1}^K e^{x_k}}\) for \(i = 1, \ldots, K\) | \(\frac{\partial f(\mathbf{x})_i}{\partial x_j} = f(\mathbf{x})_i(\delta_{ij} - f(\mathbf{x})_j)\) | \((0,1)\) |
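Because softmax depends on the entire input vector, a direct implementation can overflow when some \(x_i\) is large. Subtracting \(\max_k x_k\) from every component before exponentiating scales numerator and denominator by the same factor, so the result is unchanged while the exponents stay bounded. A minimal C++ sketch of this follows; again it is illustrative, not the stats++ interface.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Softmax: f(x)_i = e^{x_i} / sum_k e^{x_k}.
// The max is subtracted first for numerical stability; the output is identical.
std::vector<double> softmax(const std::vector<double>& x) {
    double m = *std::max_element(x.begin(), x.end());
    std::vector<double> f(x.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        f[i] = std::exp(x[i] - m);
        sum += f[i];
    }
    for (double& v : f) v /= sum;  // normalize so the outputs sum to 1
    return f;
}

int main() {
    std::vector<double> x = {1.0, 2.0, 3.0};
    for (double v : softmax(x)) std::printf("%.4f ", v);  // 0.0900 0.2447 0.6652
    std::printf("\n");
}
```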
Notes and references
- ↑ Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller, "Efficient BackProp," in Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science 1524, 9–50 (1998)
- ↑ V. Nair and G. E. Hinton, "Rectified Linear Units Improve Restricted Boltzmann Machines," Proceedings of the 27th International Conference on Machine Learning (ICML-10) (2010)
- ↑ X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks," International Conference on Artificial Intelligence and Statistics (2011)