
Neural Networks and Deep Learning. Chapter 2: The Neuron


Etymology and definition

As we saw in the previous chapter, "Neural Networks and Deep Learning. Chapter 1: Prelude", the name neural network is inspired by the biological neural networks that make up the human brain. The parallel is direct: the neurons of an artificial neural network are connected to each other by links, just as the neurons of a biological network are connected by synapses.

As we learned earlier, a neural network is based on the interaction of many simple parts working together to obtain a result that can be more or less abstract depending on the complexity of the network. Each of these simple parts is what we call a neuron. In other words, the neuron is the basic unit of information in a neural network. Next we will discuss the role the neuron plays within the network and introduce a fundamental notion: the activation function.


Role of the neuron within the neural network

From a very basic approach, the functioning of a neuron is based on receiving some numerical input values, combining them, and returning a numerical output value. So, in the first instance, we can see a neuron as the following mathematical function:

y = w1·x1 + w2·x2 + ... + wn·xn + b

Where:

  • y is the output value.
  • x1,...,xn are the input values.
  • w1,...,wn are called weights or parameters, and they measure the influence of each input value on the final result.
  • b, called the intercept or bias, is an additional parameter that frees the function from having to pass through the origin, allowing a better fit to the data.
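
As a minimal sketch of this formula in Python (the inputs, weights, and bias below are made-up numbers chosen only for illustration):

```python
import numpy as np

def neuron(x, w, b):
    """Linear neuron: the weighted sum of the inputs plus the bias."""
    return np.dot(w, x) + b

x = np.array([0.5, -1.0, 2.0])   # input values x1, x2, x3
w = np.array([0.8, 0.2, -0.5])   # weights w1, w2, w3
b = 0.1                          # bias

print(neuron(x, w, b))           # w1*x1 + w2*x2 + w3*x3 + b = -0.7
```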

At this point we may think that a neuron is exactly the same as a linear regression, and so far the behavior is indeed analogous. However, there is an element that plays a fundamental role in artificial neurons but is absent from linear regression: the activation function, which we will discuss in the next section.

Neuron activation function

To highlight the importance of activation functions, let's look at a few examples below.

First let's suppose we have the distribution of dots in the following image and we want to separate the reds from the blues.

[Figure: a distribution of red and blue points]

As we can see in the image below, it is easy to find one of the infinitely many linear functions that separate these two groups. A single neuron is therefore enough in this case to classify our observations. In logic, this kind of linearly separable problem corresponds to an AND or OR gate.

[Figure: AND gate / OR gate]

The next step is to consider the distribution we see below. Looking at the image on the left, we can try to separate the blue and red points with a single linear function, but it is impossible. Instead, as we see on the right, combining two linear functions makes the separation easy. This is why we say that by combining two neurons we can solve what is known in logic as the XOR gate (a hand-built sketch follows the figure below).

[Figure: XOR gate]
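
A minimal sketch of this idea, with the weights and thresholds of the three neurons set by hand for illustration rather than learned:

```python
def step(z):
    """Step activation: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if z >= 0 else 0

def xor(a, b):
    h1 = step(a + b - 0.5)      # hidden neuron 1 behaves like an OR gate
    h2 = step(a + b - 1.5)      # hidden neuron 2 behaves like an AND gate
    return step(h1 - h2 - 0.5)  # output neuron: OR but not AND, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))   # prints 0, 1, 1, 0
```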

Finally, let's suppose that the points are now distributed in concentric circles, as in the image below.

We can try, but we will never find a linear function that separates the two classes, nor any combination of linear functions as in the previous step. For this problem we need nonlinear functions, as we can see in the image on the right.

[Figure: linear and nonlinear functions]

In order to construct this type of boundary we need to make use of activation functions. Their use boils down to distorting the output value of the weighted sum, so that, going back to the formula we saw before, we can now see a neuron as

y = f(w1·x1 + w2·x2 + ... + wn·xn + b)

where the function f is our activation function.
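
As a sketch, extending the earlier code with a sigmoid as the (assumed) choice of f:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, f=sigmoid):
    """Neuron: activation function applied to the weighted sum plus bias."""
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8, 0.2, -0.5])
print(neuron(x, w, b=0.1))       # sigmoid(-0.7) ≈ 0.33
```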

In fact, thanks to activation functions we have the following theorem, a general version of which was proved by Moshe Leshno and coauthors in 1993, and which is of vital importance in this field.

Universal approximation theorem: under suitable assumptions, any continuous function can be approximated to arbitrary precision by a neural network consisting of a single hidden layer with a sufficient number of neurons.
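
One way to build intuition for this theorem: subtracting two shifted sigmoid neurons produces a localized "bump", and enough bumps can be stacked to trace any continuous curve. A minimal sketch (the steepness 10 and the offsets ±1 are arbitrary illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(-3, 3, 7)
# Difference of two steep, shifted sigmoids: ~1 on (-1, 1), ~0 elsewhere.
bump = sigmoid(10 * (x + 1)) - sigmoid(10 * (x - 1))
print(np.round(bump, 3))   # [0. 0. 0.5 1. 0.5 0. 0.]
```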

Activation functions: state of the art

Below we present the most commonly used activation functions, with a short description of each one (a code sketch implementing them follows the list).

[Figure: graphs of common activation functions]
  • Linear function. This is the identity function: it makes the output equal to the input. With this activation, the neuron behaves exactly like a linear regression.
  • Step function. Useful when the output is categorical and the goal is classification. In practice, however, it is rarely used: its derivative is zero everywhere except at the jump, where it is undefined, which makes gradient-based learning difficult.
  • Sigmoid function. Very interesting because very large values converge to 1 and very negative values converge to 0, which makes it useful for representing probabilities. Its disadvantage is that its output is not centered at the origin; this can be solved with the very similar hyperbolic tangent function, whose range of (-1, 1) is centered at zero.
  • ReLU (rectified linear unit) function. This is one of the most commonly used activation functions. It outputs zero for negative values and behaves as the identity for positive values.
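
All of these fit in a few lines; a minimal NumPy sketch of the functions above, plus the hyperbolic tangent mentioned as the zero-centered alternative:

```python
import numpy as np

def linear(z):
    """Identity: with this activation the neuron is a linear regression."""
    return z

def step(z):
    """Step: 1 for non-negative inputs, 0 otherwise."""
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):
    """Sigmoid: large negative inputs map toward 0, large positive toward 1."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Hyperbolic tangent: like the sigmoid but centered at zero, range (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """ReLU: zero for negative inputs, identity for positive inputs."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (linear, step, sigmoid, tanh, relu):
    print(f.__name__, np.round(f(z), 3))
```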

These nonlinear deformations are what make chaining several neurons worthwhile: without them, any composition of neurons would collapse into a single linear function. For example, by combining a sufficient number of neurons with sigmoid activations we can obtain surfaces like the following:

[Figure: on the left, a 3D sigmoid function; on the right, a combination of several such functions]

We can observe that the intersection of this second surface with a horizontal plane generates exactly the boundary needed to classify the points with the concentric-circle distribution we saw before.
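
A hedged sketch of how such a closed boundary can be built by hand (the weights and the steepness k below are illustrative choices; with only four hidden neurons the region is a rounded square rather than a circle, but adding neurons in more directions rounds it out):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def inside(x, y, k=10.0):
    """Four hidden sigmoid neurons, one per half-plane, summed and thresholded.

    Each hidden unit is ~1 when the point lies on the inner side of one
    half-plane; only points near the origin activate all four at once.
    """
    h = (sigmoid(k * (1 - x)) + sigmoid(k * (1 + x)) +
         sigmoid(k * (1 - y)) + sigmoid(k * (1 + y)))
    return sigmoid(k * (h - 3.5))   # output fires only if all four fire

print(round(inside(0.0, 0.0), 2))   # ≈ 0.99: the origin is inside
print(round(inside(2.0, 0.0), 2))   # ≈ 0.01: a far point is outside
```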

Now that we have seen the potential of neural networks and how neurons work, we are ready to see what mechanisms they use to automate learning. But that will be in the next chapter!
