Network structure

Overview

In deep learning, we often see news about where a certain model ranks. A major breakthrough on a neural network task depends first on the dataset and second on the model structure.


The breakthroughs in the image field would not have been possible without the ImageNet dataset. This shows how important the dataset is: it records information about the objective function. But as we learned in the previous section, a dataset is flawed. It cannot record all of the objective function's information, and part of it is lost. The quality of a dataset lies in how much of the objective function's information it retains.


The task of the training phase is to use the information in the dataset (the dataset function $d(\mathbf x)$) to restore the objective function $o(\mathbf x)$. Due to various restrictions, we only get a function $f(\mathbf x)$ that approximates the objective function.


A good model has a functional form that is closer to that of the objective function, so it can better compensate for the defects of the dataset and achieve better results.

Visual display

2 points

To reuse the example from the previous section: the dataset has only 2 points, so its information retention rate is very low.

[Figure: Dataset]

Make different assumptions about its functional form:

[Figure: Straight line form]
[Figure: Parabolic form, with adjustable parameter w (shown at w = 3)]

By adjusting the parameter $w$, we can get countless parabolas, and all of them reproduce the dataset function $d(\mathbf x)$ perfectly.


There are many other functional forms; in fact, the number of possible forms is infinite. Other functional forms can also reproduce the dataset function $d(\mathbf x)$ perfectly.
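To make this concrete, here is a minimal sketch in Python. The two data points and the particular parabola family (the straight line plus a term that vanishes at both points) are assumptions chosen for illustration; they are not taken from the figures above.

```python
# Two data points, chosen only for illustration (hypothetical values).
x1, y1 = 0.0, 1.0
x2, y2 = 2.0, 5.0


def line(x):
    """The unique straight line through the two points."""
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x - x1)


def parabola(x, w):
    """A parabola through both points for any value of w.

    The extra term w * (x - x1) * (x - x2) is zero at x1 and at x2,
    so it never changes the values at the data points.
    """
    return line(x) + w * (x - x1) * (x - x2)


for w in (-2.0, 0.5, 3.0):
    # Columns: w, value at x1, value at x2, value halfway between the points.
    print(w, parabola(x1, w), parabola(x2, w), parabola(1.0, w))
```

Every value of `w` gives a curve that matches the dataset exactly, yet the curves disagree everywhere else, for example at `x = 1.0`.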

3 points

[Figure: Straight line form]
[Figure: Parabolic form, with adjustable parameter w (shown at w = 3)]

5 points

[Figure: Straight line form]
[Figure: Parabolic form, with adjustable parameter w (shown at w = 25)]

As the number of data points increases, the functional form gets closer to a straight line, but there are still infinitely many possibilities.
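The same point can be checked with more data. The sketch below assumes five points that lie on a hypothetical line `y = 2x + 1` and builds a family of higher-degree curves (an illustrative construction, not the one plotted above) that still pass through every point.

```python
# Five data points on a hypothetical line y = 2x + 1 (illustrative values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def line(x):
    return 2.0 * x + 1.0


def candidate(x, w):
    """The line plus w times a polynomial that is zero at every data point."""
    bump = 1.0
    for xi in xs:
        bump *= x - xi
    return line(x) + w * bump


for w in (0.0, 0.1, 1.0):
    # Largest deviation from the dataset over all five points.
    worst = max(abs(candidate(x, w) - y) for x, y in zip(xs, ys))
    # Value between two data points, where the candidates are free to differ.
    print(w, worst, candidate(0.5, w))
```

All candidates agree with the five data points, but between the points they can be arbitrarily far apart, depending on `w`.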

Summary

Information about the functional form of the objective function cannot be obtained from the data alone; infinitely many forms remain possible.

Design the structure of the neural network

The information in the dataset is insufficient. We need to obtain additional information from other places and use it to guide the structural design of the neural network to make up for the lack of information.

Special structure

In various deep learning tasks, good models use highly specialized structures.


Many specialized structures are designed with reference to how a certain objective function processes its input. For example, CNNs are loosely modeled on the organization of the visual system. Although the specific form of the objective function is unknown, people can often obtain some information about it, and get better results by imitating the way it processes information.

Structural Design

The structure of the neural network is the skeleton of the algorithm, and it directly determines the algorithm's ultimate potential. If the skeleton is poorly designed, the result will be unsatisfactory no matter how you train it.

[Figure: Inappropriate structure, with adjustable parameters w = 1, b = 2]
[Figure: Appropriate structure, with adjustable parameters a = 1, b = 0, c = 0]
    import numpy as np

    def f(x):
        # Square the input (works elementwise on NumPy arrays).
        return x**2

The structure of a neural network cannot be obtained through training; it usually has to be designed by hand.
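As a rough sketch of why training cannot make up for an unsuitable structure, the code below fits two fixed structures to the same data by gradient descent. The objective `x ** 2`, the sample points, the learning rate, and the step count are all assumptions made for illustration. Training only adjusts parameters inside the given structure: the linear form `w*x + b` stays far from the data, while the quadratic form `a*x**2 + b*x + c` can match it.

```python
def predict(features, params, x):
    """Model output: a weighted sum of fixed feature functions."""
    return sum(p * f(x) for p, f in zip(params, features))


def mse(features, params, xs, ys):
    """Mean squared error of the model on the dataset."""
    return sum((predict(features, params, x) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def train(features, xs, ys, lr=0.05, steps=2000):
    """Plain gradient descent on the mean squared error."""
    params = [0.0] * len(features)
    n = len(xs)
    for _ in range(steps):
        grads = [0.0] * len(features)
        for x, y in zip(xs, ys):
            err = predict(features, params, x) - y
            for k, f in enumerate(features):
                grads[k] += 2.0 * err * f(x) / n
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Samples of an assumed objective function x**2 on [-2, 2].
xs = [i * 0.5 for i in range(-4, 5)]
ys = [x ** 2 for x in xs]

# Two structures: w*x + b and a*x**2 + b*x + c.
linear = [lambda x: x, lambda x: 1.0]
quadratic = [lambda x: x * x, lambda x: x, lambda x: 1.0]

linear_params = train(linear, xs, ys)
quadratic_params = train(quadratic, xs, ys)
linear_loss = mse(linear, linear_params, xs, ys)
quadratic_loss = mse(quadratic, quadratic_params, xs, ys)

print("linear structure loss:   ", linear_loss)
print("quadratic structure loss:", quadratic_loss)
```

More training steps would not help the linear structure here: its loss is limited by the form itself, not by the optimizer.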


Architectures can also be searched for by an algorithm: try a variety of different architectures and choose the best one according to the final training results. This requires a huge amount of computing power, so it is only practical for a very small number of institutions with strong financial resources.
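A toy version of this idea can be sketched as follows. Here the "architectures" are just polynomial degrees, and the objective function, data points, candidate list, and hyperparameters are all hypothetical. Real architecture search explores far larger spaces, which is where the computing cost comes from.

```python
def predict(params, x):
    """Polynomial model: params[k] multiplies x**k."""
    return sum(p * x ** k for k, p in enumerate(params))


def mse(params, xs, ys):
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(degree, xs, ys, lr=0.1, steps=3000):
    """Fit a polynomial of the given degree by gradient descent."""
    params = [0.0] * (degree + 1)
    for _ in range(steps):
        grads = [0.0] * len(params)
        for x, y in zip(xs, ys):
            err = predict(params, x) - y
            for k in range(len(params)):
                grads[k] += 2.0 * err * x ** k / len(xs)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


def objective(x):
    # Assumed objective function, unknown to the search itself.
    return x ** 2


train_xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
val_xs = [-0.75, -0.25, 0.25, 0.75]
train_ys = [objective(x) for x in train_xs]
val_ys = [objective(x) for x in val_xs]

# Train every candidate structure, then score it on held-out points.
candidate_degrees = [0, 1, 2]
val_errors = {}
for degree in candidate_degrees:
    params = train(degree, train_xs, train_ys)
    val_errors[degree] = mse(params, val_xs, val_ys)

best_degree = min(val_errors, key=val_errors.get)
print(val_errors)
print("best degree:", best_degree)
```

Each candidate has to be trained in full before it can be compared, so the cost grows with the number of candidates tried.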


To design a good neural network structure by hand, we need to:

  1. Have some understanding of, or a reasonable conjecture about, the objective function corresponding to the task;
  2. Be familiar with the common structures of neural networks and understand the principles behind them;
  3. Use these structures to assemble a structure similar to the objective function.

Summary

  1. The dataset cannot provide enough information, so the structure of the neural network has to make up for the missing information.
  2. The information in the neural network structure comes from understanding and imitating how the objective function processes its input.

Problem

Neural networks can simulate arbitrary functions, so why do we need to design their structure?