Master Activation Functions in 5 Minutes: The Only Activation Functions Guide You'll Ever Need

Activation functions enable neural networks to learn complex patterns in data. They function similarly to neurons in the human brain, determining which signals to pass forward.


Data can be classified into two main types:

  • Linear

  • Non-Linear


Linear Data

Linear data, as its name implies, forms a straight line when plotted on a graph: the Y values increase at a constant rate as the X values increase. To illustrate this, here is a simple dataset of the kind commonly found in regression problems.

Data Points:

X = [1, 2, 3, 4, 5]

Y = [5, 6, 7, 8, 9]

When we plot these points, we get the following graph.

[Figure: Linear Data Example]
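For reference, the plot above could be reproduced with a few lines of Python (a minimal sketch assuming matplotlib is installed; the labels and styling are illustrative):

import matplotlib.pyplot as plt

X = [1, 2, 3, 4, 5]
Y = [5, 6, 7, 8, 9]  # Y = X + 4, so every point falls on the same straight line

plt.scatter(X, Y)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Linear Data Example")
plt.show()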



Non-Linear Data

Let's look at an example of non-linear data, specifically circular data. The following plot uses the make_circles function from sklearn.datasets, which creates a large circle containing a smaller circle in 2D. Non-linear datasets are commonly found in classification problems.

Neural networks need non-linear activation functions to learn the patterns in such datasets. A purely linear activation function works only for linear data and fails on non-linear data. For non-linear data, we typically combine ReLU in the hidden layers with either sigmoid (for binary classification) or softmax (for multiclass classification) at the output.

[Figure: Non-Linear Data Example]
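Here is a minimal sketch of how such a dataset can be generated and plotted with make_circles (the sample count, noise level, and random_state are illustrative assumptions):

from sklearn.datasets import make_circles
import matplotlib.pyplot as plt

# 1,000 points: a large outer circle containing a smaller inner circle
X, y = make_circles(n_samples=1000, noise=0.03, random_state=42)

plt.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm")
plt.title("Non-Linear Data Example")
plt.show()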

Let's explore three key activation functions:

  • Linear

  • ReLU

  • Sigmoid

To demonstrate how these functions behave, we'll test them using this input array:

X = [-20, -19, -18, -17, -16, -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
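For the snippets that follow, this array can be built as a tensor (a minimal sketch assuming TensorFlow 2.x and NumPy; the variable name X simply mirrors the list above):

import numpy as np
import tensorflow as tf

# Integers from -20 to 19 inclusive, matching the list above
X = tf.constant(np.arange(-20, 20), dtype=tf.float32)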


Linear (pass-through) Activation Functions

Code: tf.keras.activations.linear

Input: x

Output: x

In essence, a linear function acts as a pass-through—it simply returns the input value as its output without any transformation.

[Figure: Linear Activation Function Graph]
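As a quick check (a sketch that reuses the X tensor defined earlier), the linear activation returns its input untouched:

y_linear = tf.keras.activations.linear(X)
print(y_linear.numpy())  # identical to X: [-20., -19., ..., 18., 19.]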

ReLU Activation Functions

Code: tf.keras.activations.relu

Input: x

Output: max(0, x)

For any input value, ReLU returns positive numbers as-is and converts negative numbers to 0.

[Figure: ReLU Activation Function Graph]
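Again reusing the X tensor from earlier, a short sketch of ReLU in action:

y_relu = tf.keras.activations.relu(X)
print(y_relu.numpy())  # negatives clipped to 0: [0., 0., ..., 0., 1., 2., ..., 19.]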

Sigmoid Activation Functions

Code: tf.keras.activations.sigmoid

Input: x

Output: (1 / (1 + exp(-x)))

The sigmoid function produces non-linear output, squashing any input into the range between 0 and 1. That's why we typically use it in combination with ReLU when working with non-linear data.

[Figure: Sigmoid Activation Function Graph]
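And the same X tensor passed through sigmoid (a sketch; the values in the comment are rounded approximations):

y_sigmoid = tf.keras.activations.sigmoid(X)
print(y_sigmoid.numpy())  # squashed into (0, 1): sigmoid(-20) ≈ 2.1e-9, sigmoid(0) = 0.5, sigmoid(19) ≈ 0.99999999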

In short, non-linear data calls for activation functions that produce non-linear output, which is why we typically combine ReLU hidden layers with a sigmoid (or softmax) output layer.
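To make that concrete, here is a minimal sketch of a binary classifier for the circles dataset above, with ReLU in the hidden layers and sigmoid at the output (the layer sizes, optimizer, and epoch count are illustrative assumptions, not tuned values):

import tensorflow as tf
from sklearn.datasets import make_circles

# Non-linear, binary-labelled data: one large circle containing a smaller one
X, y = make_circles(n_samples=1000, noise=0.03, random_state=42)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(16, activation="relu"),    # ReLU hidden layers learn the non-linear boundary
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sigmoid output for binary classification
])

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, y, epochs=25, verbose=0)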
