# 1. 神经网络结构

## 1.1. Multi-Layer Neural Network

A 3-layer network: input layer, hidden layer, output layer. Except for the input units, each unit has a bias.

## 1.2. Forward Computation

Hidden layer: a signal $x_i$ at the input of synapse $i$ connected to neuron $j$ is multiplied by the synaptic weight $w_{ji}$, where $i$ indexes the input layer and $j$ the hidden layer. $w_{j0}$ is the bias, with $x_0 = +1$.

• Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at $+1$.
• The synaptic links of a neuron weight their respective input signals.
• The weighted sum of the input signals defines the induced local field of the neuron in question.
• The activation link squashes the induced local field of the neuron to produce an output.
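The bullets above can be sketched numerically. This is a minimal illustration, not code from the text; the weight and input values are made up, and `tanh` stands in for an arbitrary squashing activation:

```python
import numpy as np

# Induced local field of neuron j: net_j = sum_i w_ji * x_i,
# with the bias folded in as weight w_j0 on a fixed input x_0 = +1.
w = np.array([0.5, -1.0, 2.0])   # w_j0 (bias), w_j1, w_j2
x = np.array([1.0, 0.3, 0.4])    # x_0 = +1, then the real inputs x_1, x_2

net = w @ x                      # induced local field (weighted sum)
y = np.tanh(net)                 # activation link "squashes" net

print(net)           # 1.0
print(round(y, 4))   # 0.7616
```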

Output layer: $f(\cdot)$ is the *activation function*. It defines the output of a neuron in terms of the induced local field $net$.

For example, the output of hidden neuron $j$ is $y_j = f(net_j)$ with $net_j = \sum_i w_{ji} x_i$, and this weighted-sum-plus-activation step is repeated once per layer; the number of repetitions is the number of hidden layers.

So the output layer applies the same computation to the hidden-layer outputs, with its own weights and activation function.

The activation function of the output layer can differ from that of the hidden layer, and each unit may have its own activation function.
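A hedged sketch of the full 3-layer forward pass described above. The layer sizes and the choice of `tanh` for the hidden layer and identity for the output layer are illustrative assumptions; biases are folded in via a fixed $+1$ input:

```python
import numpy as np

def forward(x, W_hidden, W_output):
    x = np.concatenate(([1.0], x))   # prepend x_0 = +1 for the hidden biases
    h = np.tanh(W_hidden @ x)        # hidden layer: f = tanh
    h = np.concatenate(([1.0], h))   # prepend +1 for the output biases
    return W_output @ h              # output layer: identity activation

rng = np.random.default_rng(0)
W_hidden = rng.standard_normal((4, 3))   # 4 hidden units, 2 inputs + bias
W_output = rng.standard_normal((1, 5))   # 1 output unit, 4 hidden units + bias

out = forward(np.array([0.2, -0.7]), W_hidden, W_output)
print(out.shape)  # (1,)
```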

## 1.3. BP Algorithm

The popularity of on-line learning for the supervised training of multilayer perceptrons has been further enhanced by the development of the back-propagation algorithm. Backpropagation, an abbreviation for "backward propagation of errors", is a widely used method of supervised training. We need the output activations of each hidden layer. The partial derivative $\partial J / \partial w_{ji}$ represents a sensitivity factor, determining the direction of search in weight space for the synaptic weight $w_{ji}$.
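The role of $\partial J / \partial w$ as a sensitivity factor can be shown on a scalar toy problem. The cost $J(w) = (w-3)^2$ and the learning rate are illustrative assumptions, not from the text; each step moves the weight against its partial derivative:

```python
# Gradient descent on J(w) = (w - 3)**2, whose minimum is at w = 3.
def dJ_dw(w):
    return 2.0 * (w - 3.0)   # sensitivity factor: direction of steepest ascent

w, eta = 0.0, 0.1
for _ in range(100):
    w -= eta * dJ_dw(w)      # step opposite the sensitivity factor

print(round(w, 4))  # 3.0 (converged to the minimum)
```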

the instantaneous error energy of neuron $j$ is defined by

$$J_j = \tfrac{1}{2} e_j^2, \qquad e_j = d_j - y_j,$$

where $d_j$ is the desired response and $y_j$ the actual output of neuron $j$.

In the batch method of supervised learning, adjustments to the synaptic weights of the multilayer perceptron are performed *after* the presentation of all the $N$ examples in the training sample $\mathcal{T}$ that constitute one *epoch* of training. In other words, the cost function for batch learning is defined by the average error energy $J(w)$.
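In symbols, a standard form of this average error energy (the summation ranges here are an assumption, chosen to be consistent with the per-neuron error energy $\tfrac{1}{2}e_j^2$ above) is:

$$J(w) = \frac{1}{N} \sum_{n=1}^{N} \sum_{j} \frac{1}{2}\, e_j^2(n)$$

where $n$ runs over the $N$ training examples of the epoch and $j$ over the output neurons.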

• Firstly, define the local gradient (error term) of the output layer, where $d_k$ is the desired response of output neuron $k$:

$$\delta_k = (d_k - y_k)\, f'(net_k)$$

• Then propagate the error back from the output layer to the hidden layer (the input → hidden weights):

$$\delta_j = f'(net_j) \sum_k \delta_k w_{kj}, \qquad \Delta w_{ji} = \eta\, \delta_j\, x_i$$
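The two steps above can be sketched as one back-propagation update for a 1-hidden-layer network. The sigmoid hidden layer, linear output (so $f'(net_k) = 1$), learning rate, and toy data are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = np.array([1.0, 0.5, -0.2])        # x_0 = +1 carries the hidden biases
d = np.array([0.3])                   # desired response
W1 = rng.standard_normal((4, 3))      # input -> hidden weights
W2 = rng.standard_normal((1, 4))      # hidden -> output weights
eta = 0.1

# forward pass
h = sigmoid(W1 @ x)
y = W2 @ h                            # linear output unit

# backward pass
delta_out = d - y                               # output error term (f' = 1)
delta_hid = (h * (1 - h)) * (W2.T @ delta_out)  # back-propagated through sigmoid'
W2 += eta * np.outer(delta_out, h)
W1 += eta * np.outer(delta_hid, x)

# after the update, the error on this example shrinks
y_new = W2 @ sigmoid(W1 @ x)
```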

## 1.7. Backpropagation Through Time (BPTT)

• The threshold (bias) of the $h$-th hidden neuron is denoted by $\gamma_h$;
• the threshold of the $j$-th output neuron is denoted by $\theta_j$;
• the connection weight between the $i$-th input neuron and the $h$-th hidden neuron is $v_{ih}$;
• the connection weight between the $h$-th hidden neuron and the $j$-th output neuron is $w_{hj}$;
• the input received by the $h$-th hidden neuron is $\alpha_h = \sum_i v_{ih} x_i$;
• the input received by the $j$-th output neuron is $\beta_j = \sum_h w_{hj} b_h$, where $b_h$ is the output of the $h$-th hidden neuron;
• assume both the hidden and output neurons use the logistic function $f(x) = \dfrac{1}{1 + e^{-x}}$.
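The quantities in the list above can be sketched in one forward pass. The symbol names (`v`, `w`, `alpha`, `beta`, `b`, `gamma`, `theta`) follow common BP notation, and the dimensions and random values are illustrative assumptions:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, q, l = 3, 4, 2                  # inputs, hidden units, output units
rng = np.random.default_rng(2)
x = rng.standard_normal(d_in)
v = rng.standard_normal((d_in, q))    # v[i, h]: input i -> hidden h
w = rng.standard_normal((q, l))       # w[h, j]: hidden h -> output j
gamma = np.zeros(q)                   # hidden thresholds
theta = np.zeros(l)                   # output thresholds

alpha = x @ v                         # alpha_h = sum_i v_ih * x_i
b = logistic(alpha - gamma)           # hidden outputs b_h
beta = b @ w                          # beta_j = sum_h w_hj * b_h
y = logistic(beta - theta)            # network outputs y_j
print(y.shape)  # (2,)
```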

## 1.8. Siamese Networks

One Shot Learning with Siamese Networks in PyTorch
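The core Siamese idea is that two inputs pass through the *same* embedding network (shared weights) and training acts on the distance between the two embeddings. A minimal NumPy sketch of that idea, standing in for the PyTorch model (the embedding, margin, and contrastive loss form here are illustrative assumptions):

```python
import numpy as np

def embed(x, W):
    return np.tanh(W @ x)             # the SAME W is used for both inputs

def contrastive_loss(x1, x2, same, W, margin=1.0):
    dist = np.linalg.norm(embed(x1, W) - embed(x2, W))
    if same:                          # pull matching pairs together
        return 0.5 * dist**2
    return 0.5 * max(0.0, margin - dist)**2  # push non-matching pairs apart

rng = np.random.default_rng(3)
W = rng.standard_normal((4, 5))
a, b = rng.standard_normal(5), rng.standard_normal(5)
print(contrastive_loss(a, a, same=True, W=W))   # 0.0: identical inputs, zero distance
```

Because the weights are shared, a single gradient step moves the one embedding function, which is what makes one-shot comparison of unseen classes possible.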