Neural Network

Network structure

Overview

In deep learning, we often see news about a model reaching the top of some ranking. A major breakthrough on a task depends first on the dataset and second on the model structure.

The breakthroughs in the image field would not have been possible without the ImageNet dataset. This shows how much the dataset matters: it records information about the objective function. But as we learned in the previous section, a dataset is flawed. It cannot record the objective function completely, and part of the information is lost. The quality of a dataset lies in how much of the objective function's information it retains.

The task of the training phase is to use the information in the dataset (the dataset function $d(\mathbf x)$) to recover the objective function $o(\mathbf x)$. Due to various restrictions, we only get a function $f(\mathbf x)$ that approximates the objective function.

A good model has a functional form closer to that of the objective function, so it can better compensate for the defects of the dataset and achieve better results.

Visual display

2 points

Returning to the example from the previous section: the dataset has only 2 points, so its information retention rate is very low.


Figure: Dataset

Make different assumptions about its functional form:


Figure: Straight line form

Figure: Parabolic form (adjustable parameter $w$)

By adjusting the parameter $w$, we can get countless parabolas, and all of them perfectly match the dataset function $d(\mathbf x)$.

There are infinitely many other functional forms, and many of them can also perfectly match the dataset function $d(\mathbf x)$.
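The claim that countless parabolas pass through the same 2 points can be checked numerically. The sketch below is only an illustration: the two data points and the helper names `parabola_through` and `evaluate` are assumptions, not taken from the original figure. For any chosen $w$, it solves for the remaining coefficients of $y = wx^2 + bx + c$ so that the curve passes through both points.

```python
import numpy as np

def parabola_through(p1, p2, w):
    """Return (w, b, c) so that y = w*x**2 + b*x + c passes through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # Subtracting the two point equations eliminates c and gives b.
    b = (y2 - y1) / (x2 - x1) - w * (x1 + x2)
    c = y1 - w * x1**2 - b * x1
    return w, b, c

def evaluate(coeffs, x):
    w, b, c = coeffs
    return w * x**2 + b * x + c

# Hypothetical 2-point dataset (not the data from the original figure).
points = [(1.0, 2.0), (3.0, 4.0)]
xs = np.array([p[0] for p in points])
ys = np.array([p[1] for p in points])

for w in (-2.0, 0.0, 0.5, 3.0):
    coeffs = parabola_through(points[0], points[1], w)
    fits = np.allclose(evaluate(coeffs, xs), ys)
    print(f"w={w:+.1f}  b={coeffs[1]:+.2f}  c={coeffs[2]:+.2f}  fits both points: {fits}")
```

Setting `w = 0.0` gives the straight line as a special case; every other value of `w` gives a different parabola through the same two points.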

3 points


Figure: Straight line form

Figure: Parabolic form (adjustable parameter $w$)

5 points


Figure: Straight line form

Figure: Parabolic form (adjustable parameter $w$)

As the number of data points increases, the plausible functional form gets closer to a straight line, but there are still infinitely many possibilities.
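One way to see why more points never remove the ambiguity: take any function that fits the data and add a term that is zero at every data point. The sketch below assumes a hypothetical dataset of 5 points lying exactly on the line $y = 2x + 1$; the names `line` and `alternative` are illustrative only.

```python
import numpy as np

# Hypothetical dataset: 5 points sampled from the line y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = 2.0 * xs + 1.0

def line(x):
    return 2.0 * x + 1.0

def alternative(x, w):
    """A different function that agrees with the line at every data point.

    The product (x - x_1)(x - x_2)...(x - x_5) is zero at each x_i,
    so every value of w fits the dataset equally well.
    """
    x = np.asarray(x, dtype=float)
    bump = np.prod(x[..., None] - xs, axis=-1)
    return line(x) + w * bump

for w in (0.0, 0.1, -1.0):
    on_data = np.allclose(alternative(xs, w), ys)
    between = alternative(0.5, w)
    print(f"w={w:+.1f}  fits all 5 points: {on_data}  value at x=0.5: {between:.3f}")
```

All three functions match the dataset exactly, yet they disagree between the data points. The same construction works for any finite number of points.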

Summary

The functional form of the objective function cannot be determined from the data alone; infinitely many forms remain possible.

Design the structure of the neural network

The information in the dataset is insufficient. We need additional information from elsewhere to guide the structural design of the neural network and make up for what is missing.

Special structure

Across deep learning tasks, good models use highly specialized structures. For example:

Image tasks: 2-dimensional CNN
Text tasks: Embedding, 1-dimensional CNN, RNN, CRF, Transformer, etc.

Many specialized structures are designed with reference to how the objective function is actually computed. For example, CNNs are loosely modeled on the organization of the visual system. Although the exact form of the objective function is unknown, we can often obtain some information about it, and we get better results by imitating the way it processes its input.
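As a rough illustration of the kind of information a specialized structure carries, the sketch below implements a minimal 1-dimensional convolution in NumPy. The helper `conv1d` and the sizes used are assumptions made for this example. The structure itself encodes two assumptions about the objective function: each output depends only on nearby inputs, and the same weights apply at every position.

```python
import numpy as np

def conv1d(x, kernel):
    """Slide one shared kernel over x (no padding, stride 1).

    This is cross-correlation, the operation deep learning libraries
    usually call convolution.
    """
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

rng = np.random.default_rng(0)
x = rng.normal(size=16)
kernel = rng.normal(size=3)

# Shifting the input by one step shifts the output by one step.
y = conv1d(x, kernel)
y_shifted = conv1d(np.roll(x, 1), kernel)
print("output shifts with input:", np.allclose(y_shifted[1:], y[:-1]))

# Weights needed to map 16 inputs to 14 outputs (ignoring biases).
n_in, n_out = len(x), len(y)
print("convolution weights:    ", kernel.size)
print("fully connected weights:", n_in * n_out)
```

A fully connected layer could represent the same mapping, but it would have to learn the locality and the weight sharing from data. The convolutional structure supplies that information before training starts.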

Structural Design

The structure of a neural network is the skeleton of the algorithm and directly determines its ultimate potential. If the skeleton is poorly designed, the result will be unsatisfactory no matter how the network is trained.


Figure: Inappropriate structure (adjustable parameters $w$, $b$)

Figure: Appropriate structure (adjustable parameters $a$, $b$, $c$)

import numpy as np

def f(x):
    """Return x squared; works on scalars and element-wise on NumPy arrays."""
    return x**2

The structure of a neural network cannot be learned through training; it usually has to be designed by hand.
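The limit that structure places on training can be sketched with the function `f(x) = x**2`. The sketch assumes that the inappropriate structure is a straight line $wx + b$ and the appropriate one is a parabola $ax^2 + bx + c$; this reading is a guess based on the parameter names in the figures. `np.polyfit` stands in for training by finding the least-squares optimal parameters of each structure.

```python
import numpy as np

def f(x):
    return x**2

x = np.linspace(-2.0, 2.0, 41)
y = f(x)

# Best possible parameters for each structure (least squares).
line_coeffs = np.polyfit(x, y, deg=1)       # structure: w*x + b
parabola_coeffs = np.polyfit(x, y, deg=2)   # structure: a*x**2 + b*x + c

line_mse = np.mean((np.polyval(line_coeffs, x) - y) ** 2)
parabola_mse = np.mean((np.polyval(parabola_coeffs, x) - y) ** 2)

print(f"line MSE:     {line_mse:.4f}")
print(f"parabola MSE: {parabola_mse:.4f}")
```

Even with optimal parameters, the line keeps a large error, while the parabola fits almost exactly. No amount of further training changes the line's result, because the limitation lies in the structure rather than in the parameters.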

Architectures can also be searched for algorithmically: many candidate architectures are tried, and the best one is chosen according to its final training results. This requires a huge amount of computing power and is practical only for the few institutions that can afford it.
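A toy version of such a search might look like the sketch below. It is not a real neural architecture search: polynomial degree stands in for the architecture, and the objective function, noise level, and candidate list are all assumptions chosen for illustration. Each candidate is trained separately and scored on held-out data, which is why the cost grows with the number of candidates.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Hypothetical objective function, for illustration only.
    return x**2

# Noisy training and validation samples of the objective function.
x_train = rng.uniform(-2.0, 2.0, size=30)
y_train = objective(x_train) + rng.normal(scale=0.1, size=30)
x_val = rng.uniform(-2.0, 2.0, size=30)
y_val = objective(x_val) + rng.normal(scale=0.1, size=30)

def validation_error(degree):
    """Train one candidate structure, then score it on held-out data."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    return np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)

candidates = [1, 2, 3, 5, 8]  # each degree stands in for one architecture
errors = {d: validation_error(d) for d in candidates}
best = min(errors, key=errors.get)

for d, e in errors.items():
    print(f"degree {d}: validation MSE = {e:.4f}")
print("selected degree:", best)
```

Here each "training run" is a single least-squares fit. With real networks, every candidate needs a full training run, so the total cost is roughly the number of candidates times the cost of training one model.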

To design a good neural network structure by hand, we need to:

Have some understanding of, or a reasonable conjecture about, the objective function of the task;
Be familiar with the common neural network structures and understand their principles;
Assemble these structures into one that resembles the objective function.

Summary

The dataset cannot provide enough information, so the structure of the neural network must make up for what is missing.
The information built into the network structure comes from understanding and imitating how the objective function processes its input.

Problem

Neural networks can simulate arbitrary functions, so why do we need to design their structure?