- What is a Neural Network?
- Dive into the neuron
- How does a neural network simulate an arbitrary function?
- Why do we need neural networks?

- How to construct a neural network
- Fully connected neural network
- Use a graphical tool to design neural networks
- The "activation function" of the output layer

- How to train a neural network
- Learning algorithm and principle
- Build and train neural networks from scratch
- Rewrite the code using PyTorch
- Use a graphical tool to train neural networks

- Some important problems of neural networks
- Network structure
- Overfitting
- Underfitting
- Overfitting vs underfitting
- Initialization
- Vanishing gradient and exploding gradient

- Convolutional Neural Network (CNN)
- 1D-convolution
- 1D-convolution experiments
- 1D-pooling
- 1D-CNN experiments
- 2D-CNN
- 2D-CNN experiments

- Recurrent Neural Network (RNN)
- Vanilla RNN
- Seq2seq, Autoencoder, Encoder-Decoder
- Advanced RNN
- RNN classification experiment

- Natural language processing
- Embedding: Convert symbols to values
- Text Classification 1
- Text Classification 2
- TextCNN
- Entity recognition
- Word segmentation, POS tagging and chunking
- Sequence tagging in action
- Bidirectional RNN
- BI-LSTM-CRF
- Attention

- Language Models
- n-gram Model: Unigram
- n-gram Model: Bigram
- n-gram Model: Trigram
- RNN Language Model
- Transformer Language Model

- Linear Algebra
- Vector
- Matrix
- Dive into matrix multiplication
- Tensor

In the field of deep learning, we often see news of some model topping a leaderboard. A major breakthrough on a neural-network task depends first on the dataset and second on the model structure.

Breakthroughs in the image field would have been impossible without the ImageNet dataset. This is the importance of the dataset: it records information about the objective function. But as we learned in the previous section, every dataset is flawed: it cannot record the objective function's information completely, and part of it is inevitably lost. The quality of a dataset lies in how much of the objective function's information it retains.

The task of the training phase is to use the information in the dataset (the dataset function $d(\mathbf x)$) to recover the objective function $o(\mathbf x)$. Due to various restrictions, we only obtain a function $f(\mathbf x)$ that approximates the objective function.
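As a toy illustration of this gap, assume a hypothetical objective $o(x) = \sin x$ (chosen only for this sketch), sample a small noisy dataset from it, and fit an approximation $f(x)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed objective function o(x); in a real task it is unknown.
def o(x):
    return np.sin(x)

# Dataset function d(x): a finite number of noisy samples of o(x),
# so part of the objective's information is already lost here.
x_train = rng.uniform(-3, 3, 30)
y_train = o(x_train) + rng.normal(0.0, 0.1, 30)

# Training yields f(x) -- here a cubic least-squares fit -- which can
# only approximate o(x), never recover it exactly.
f = np.poly1d(np.polyfit(x_train, y_train, deg=3))

x_test = np.linspace(-3, 3, 200)
gap = float(np.max(np.abs(f(x_test) - o(x_test))))  # f != o
```

The nonzero `gap` is the point: no matter how well $f$ fits the samples, it only approximates $o$.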

A good model assumes a functional form closer to the objective function, which better compensates for the defects of the dataset and yields better results.

Recall the example from the previous section: the dataset contains only 2 points, so its information retention rate is very low.

We can make different assumptions about its functional form:

By adjusting the parameters $w$, we can get infinitely many parabolas, every one of which fits the dataset function $d(\mathbf x)$ perfectly.
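To make this concrete, assume (hypothetically) that the two points are $(0,0)$ and $(1,1)$. Any quadratic $a x^2 + b x + c$ through both must satisfy $c = 0$ and $a + b = 1$, leaving $a$ completely free:

```python
# Hypothetical 2-point dataset, standing in for d(x).
points = [(0.0, 0.0), (1.0, 1.0)]

# Quadratics f(x) = a*x^2 + b*x + c through both points must satisfy
# c = 0 and a + b = 1, so a remains free: infinitely many parabolas.
def quadratic(a):
    return lambda x: a * x**2 + (1.0 - a) * x

# Every choice of a fits the dataset perfectly.
for a in [-2.0, 0.5, 3.0, 100.0]:
    f = quadratic(a)
    assert all(abs(f(x) - y) < 1e-12 for x, y in points)
```

Two data points constrain three coefficients, so the data alone cannot single out one parabola.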

There are many other functional forms as well -- the space of functional forms is itself infinite -- and many of them can also fit the dataset function $d(\mathbf x)$ perfectly.

As the number of data points increases, the fitted functions converge toward a straight line, but infinitely many possibilities still remain.
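A sketch of this effect, assuming a hypothetical linear objective $o(x) = 2x + 1$ observed with noise: a cubic fitted to a handful of points bends freely, but with many points its higher-order coefficients collapse toward zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fit a cubic to n noisy samples of an assumed linear objective.
def fit_cubic(n):
    x = rng.uniform(-1, 1, n)
    y = 2 * x + 1 + rng.normal(0.0, 0.2, n)
    return np.polyfit(x, y, deg=3)  # coefficients [a3, a2, a1, a0]

few = fit_cubic(4)     # exactly interpolates 4 noisy points; the
                       # cubic term is unconstrained by the truth
many = fit_cubic(200)  # noise averages out: close to [0, 0, 2, 1]
```

More data narrows the family of plausible forms, but never to a single one.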

The form of the objective function cannot be determined from the data alone; infinitely many forms remain possible.

The information in the dataset is insufficient. We need to obtain additional information from elsewhere and use it to guide the structural design of the neural network, making up for what the data lacks.

In various deep learning tasks, good models use highly specialized structures. For example:

- Image tasks: 2-dimensional CNN
- Text tasks: Embedding, 1-dimensional CNN, RNN, CRF, Transformer, etc.

Many specialized structures are designed by imitating the way the objective function processes its input. For example, CNNs mimic the organization of the visual nervous system. Although the specific form of the objective function is unknown, people can often obtain partial information about it, and by simulating its processing they obtain better results.
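As a minimal sketch of the structural prior a CNN encodes -- each output looks only at a small local patch, and the same weights are reused at every location -- here is a hand-rolled 2D convolution (the edge-detecting kernel and the tiny image are hypothetical examples):

```python
import numpy as np

# Valid (no padding) 2D convolution: one small kernel, shared across
# every location -- locality plus weight sharing as a structural prior.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge_kernel = np.array([[1.0, -1.0]])  # responds to horizontal change
img = np.zeros((3, 4))
img[:, 2:] = 1.0                       # image with a vertical edge
response = conv2d(img, edge_kernel)    # nonzero only along the edge
```

The kernel has just two weights yet scans the whole image; that reuse is information the dataset never had to supply.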

The structure of the neural network is the skeleton of the algorithm and directly determines the algorithm's ultimate potential. If the skeleton is designed poorly, no amount of training will produce satisfactory results.

```python
import numpy as np

def f(x):
    return x**2
```

The structure of a neural network cannot be obtained through training; it usually must be designed by hand.

There is also architecture search by algorithm: trying many different architectures and choosing the best one according to the final training results. This requires a huge amount of computing power and is practical only for a small number of institutions with strong financial resources.
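A toy caricature of that idea, assuming a made-up objective $\sin(3x)$: try a few candidate hidden widths, "train" each (here only a closed-form output layer over fixed random hidden features, a drastic simplification of real training), and keep the best. Real architecture search explores vastly larger spaces, hence the cost:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * x)  # assumed objective, for illustration only

# "Train" a one-hidden-layer net of a given width: the hidden weights
# stay random and only the output weights are fit in closed form.
def train_error(width):
    W = rng.normal(size=(1, width))
    b = rng.normal(size=width)
    H = np.tanh(x @ W + b)                     # hidden activations
    w, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit output layer
    return float(np.mean((H @ w - y) ** 2))

# The "search" is simply trying every candidate and keeping the best,
# which is why real architecture search is so expensive.
candidates = [1, 4, 32]
best_width = min(candidates, key=train_error)
```

Even this toy version trains one model per candidate; multiply that by thousands of candidates and full training runs to see where the compute bill comes from.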

To artificially design a good neural network structure, we need:

- Have some understanding of, or a reasonable conjecture about, the objective function corresponding to the task;
- Be familiar with the common structures of neural networks and understand their principles;
- Assemble these structures into an architecture that resembles the objective function.

- The dataset cannot provide enough information; the structure of the neural network must make up for the missing information.
- The information embedded in a neural network's structure comes from understanding and simulating how the objective function processes its input.