From the previous section (What is a Neural Network), we learned that a neural network is a function composed of neurons, and that each neuron is itself a function.
A neuron can be further split into two sub-functions:
an n-ary linear function: $g(x_1, \dots, x_n)$
a unary non-linear function: $h(x)$
The function represented by the neuron is:
$f(x_1, \dots, x_n) = h(g(x_1, \dots, x_n))$
Linear function $g(x_1, \dots, x_n)$
The linear function has the following form:
$g(x_1, \dots, x_n) = w_1x_1 + \dots + w_nx_n + b$
Here, $w_1, \dots, w_n, b$ are all parameters; different linear functions have different parameter values.
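For concreteness, here is a minimal Python sketch of this linear function (the parameter values below are arbitrary, chosen only for illustration):

```python
import numpy as np

def g(x, w, b):
    """n-ary linear function: g(x1, ..., xn) = w1*x1 + ... + wn*xn + b."""
    return np.dot(w, x) + b

x = np.array([1.0, 3.0])   # inputs x1, x2
w = np.array([0.5, -1.0])  # weights w1, w2 (arbitrary)
b = 2.0                    # bias (arbitrary)
print(g(x, w, b))          # 0.5*1.0 + (-1.0)*3.0 + 2.0 = -0.5
```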
Unary linear function
When $n = 1$, $g(x_1) = w_1x_1 + b$, and the graph of the function is a straight line:
Binary linear function
When $n = 2$, $g(x_1, x_2) = w_1x_1 + w_2x_2 + b$, and the graph of the function is a plane:
n-ary linear function
When $n > 2$, the graph of the function is a hyperplane. Beyond three dimensions, visualization is no longer convenient, but you can imagine that its defining characteristic is still being "straight".
Non-linear function $h(x)$
As the name suggests, a non-linear function is simply any function that is not linear. A linear function is straight; a non-linear function is curved. The most common example is the Sigmoid function:
$\mathrm{sigmoid}(x) = \dfrac{1}{1 + e^{-x}}$
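In code, the Sigmoid function is a one-liner; this is a minimal sketch assuming numpy is available:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # ~0.99995, close to 1
print(sigmoid(-10.0))  # ~0.00005, close to 0
```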
Activation function
In neural networks, this unary non-linear function is called the activation function. For some common activation functions, please refer to the activation function entry in the knowledge base, where:
Linear: $f(x) = x$ is a linear function, which amounts to not using a non-linear function at all
Softmax is a special case; strictly speaking, it is not an activation function
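For reference, here is a minimal sketch of a few other common activation functions (an illustrative selection, not the full list from the knowledge base):

```python
import numpy as np

def linear(x):
    return x                   # the identity: equivalent to using no activation

def relu(x):
    return np.maximum(0.0, x)  # Rectified Linear Unit: clips negatives to zero

def tanh(x):
    return np.tanh(x)          # squashes input into (-1, 1)
```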
Necessity
Why must the linear function be followed by a non-linear activation function?
This is because:
If every neuron were just a linear function, then the neural network composed of those neurons would also be a linear function.
Consider the following example:
$f_1(x, y) = w_1x + w_2y + b_1$
$f_2(x, y) = w_3x + w_4y + b_2$
$f_3(x, y) = w_5x + w_6y + b_3$
Then, with $f_3$ taking the outputs of $f_1$ and $f_2$ as its inputs, the function represented by the entire neural network is:
$f_3(f_1(x, y), f_2(x, y)) = w_5(w_1x + w_2y + b_1) + w_6(w_3x + w_4y + b_2) + b_3$
$= (w_5w_1 + w_6w_3)x + (w_5w_2 + w_6w_4)y + (w_5b_1 + w_6b_2 + b_3)$
which is still a linear function of $x$ and $y$.
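The same collapse can be checked numerically. Below is a minimal sketch with arbitrary weight values:

```python
# Arbitrary illustrative parameter values.
w1, w2, b1 = 1.0, 2.0, 0.5
w3, w4, b2 = -1.0, 0.5, 1.0
w5, w6, b3 = 3.0, -2.0, 0.25

def f1(x, y): return w1 * x + w2 * y + b1
def f2(x, y): return w3 * x + w4 * y + b2
def f3(x, y): return w5 * x + w6 * y + b3

def network(x, y):
    """The two-layer network: f3 applied to the outputs of f1 and f2."""
    return f3(f1(x, y), f2(x, y))

def collapsed(x, y):
    """The equivalent single linear function, coefficients expanded by hand."""
    return ((w5 * w1 + w6 * w3) * x
            + (w5 * w2 + w6 * w4) * y
            + (w5 * b1 + w6 * b2 + b3))

print(network(2.0, -1.0), collapsed(2.0, -1.0))  # both print 4.75
```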
The target functions we need to model come in all shapes, and linear functions are only one special case.
We want neural networks to be able to simulate arbitrary functions, not just linear ones. So we add a non-linear activation function to "bend" the linear function.
Complete neuron
A complete neuron combines a linear function with a non-linear activation function, which makes it far more interesting and powerful.
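In code, a complete neuron is just the composition of the two pieces we have seen; a minimal sketch, assuming numpy and the Sigmoid defined earlier:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neuron(x, w, b):
    """A complete neuron: a linear function followed by a non-linear activation.
    f(x1, ..., xn) = sigmoid(w1*x1 + ... + wn*xn + b)
    """
    return sigmoid(np.dot(w, x) + b)

# Example with arbitrary parameters:
print(neuron(np.array([1.0, 2.0]), np.array([0.3, -0.8]), 0.1))  # ~0.231
```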
Unary function
When $n = 1$, $g(x_1) = w_1x_1 + b$. Using the Sigmoid activation function, the neuron's corresponding function is:
$h(g(x)) = \mathrm{sigmoid}(wx + b)$
The graph of the function is:
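If you want to reproduce a graph like this yourself, here is a minimal plotting sketch (assuming matplotlib; the parameter values are arbitrary, so try changing them to see the curve shift and stretch):

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w, b = 2.0, -1.0  # arbitrary parameters
x = np.linspace(-10, 10, 200)
plt.plot(x, sigmoid(w * x + b))
plt.xlabel("x")
plt.ylabel("sigmoid(w*x + b)")
plt.show()
```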
Binary function
When $n = 2$, $g(x_1, x_2) = w_1x_1 + w_2x_2 + b$. Using the Sigmoid activation function, the neuron's corresponding function is:
$h(g(x_1, x_2)) = \mathrm{sigmoid}(w_1x_1 + w_2x_2 + b)$
The graph of the function is:
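The binary case can be plotted as a 3D surface; again a minimal sketch with arbitrary parameters, assuming matplotlib:

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w1, w2, b = 1.0, -2.0, 0.5  # arbitrary parameters
x1, x2 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
z = sigmoid(w1 * x1 + w2 * x2 + b)

ax = plt.figure().add_subplot(projection="3d")
ax.plot_surface(x1, x2, z)
ax.set_xlabel("x1"); ax.set_ylabel("x2"); ax.set_zlabel("f(x1, x2)")
plt.show()
```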
n-ary function
Because of the visualization problem, here we can only rely on our imagination! 😥
Question
Why can a neural network simulate complex functions by combining neurons?
Try to imagine intuitively how a slightly more complicated function could be simulated using simple neurons.
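As one hint (a minimal sketch, not the only possible construction): subtracting two shifted, steep Sigmoid neurons already produces a localized "bump", and sums of such bumps can trace out quite complicated curves:

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 400)
# Two steep sigmoid neurons stepping up at x = -2 and x = +2;
# their difference is roughly 1 on (-2, 2) and 0 elsewhere: a "bump".
bump = sigmoid(5 * (x + 2)) - sigmoid(5 * (x - 2))
plt.plot(x, bump)
plt.title("Difference of two sigmoid neurons: a bump")
plt.show()
```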