Nippon Foundation Library (Electronic Library): Recent Advances in Marine Science and Technology, 2002

NEURAL NETWORK APPROACH

The neural network approach, which has been developed recently, is based on the concept of learning: the procedure itself evolves an optimal model relating the key variables.

Figure 5. The architecture of a typical back-propagation neural network: the output cell of a perceptron (a), the sigmoid function (b), and a simple three-layer neural network (c).

Neural network computing differs from artificial intelligence and traditional computing in several important ways. Unlike traditional expert systems, where knowledge is made explicit in the form of rules, neural networks generate their own rules by learning from the examples shown to them. Learning is driven by a learning rule, which adapts the connection weights of the network in response to the example inputs and the desired outputs for those inputs; in other words, learning in a neural network is the process of acquiring a desired behavior by changing the connection weights. The advantage of neural networks is therefore that no subjective information is required to determine the model structure or to estimate parameters. For analyzing large data sets with no a priori knowledge of processes or causality, the neural network is thus the more pragmatic approach. In representing processes, however, neural networks are severely limited. Only generalized process statements can be made from neural network models, and a process interpretation of algal growth behavior will always require the dynamic mass balance and growth equation approach. While such process models are essential for understanding process interactions, their predictive capability is no better than that of the neural network approach. We propose here a red tide prediction model based on one of the principles of neural networks: the back-propagation algorithm (Rumelhart et al., 1986; Abdi, 1994).

The neural network was developed as a three-layer learning network consisting of an input layer, a hidden layer and an output layer (Fig. 5c). Each layer is made up of several nodes, and the layers are connected by sets of connection weights. The nodes receive input either from outside the model (the initial inputs) or from the connections. Each node operates on its input, transforming it to produce an analogue output. Each weight multiplies an incoming signal before it arrives at the next layer. The transformation associated with each node is a sigmoid function (Fig. 5b).

NEURAL NETWORK ALGORITHM

Here, θ_k is the bias value and W_ij are the connection weights between the input layer and the hidden layer. The response function f, a normalized logistic function with values between 0 and 1, takes U_j as its argument, and its output H_j is transferred to the next layer as follows:

Here a is the sigmoid shape parameter, which changes the gradient of the sigmoid (Fig. 5b). The sigmoid function is differentiable, and its derivative is given by:

f'(U_j)=(1/a)f(U_j){1-f(U_j)} (3)
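
The identity in equation (3) can be checked numerically. The sketch below assumes the sigmoid has the form f(x) = 1/(1 + exp(-x/a)), which is one function consistent with equation (3); the function names are illustrative and do not come from the paper.

```python
import math

def sigmoid(x, a=1.0):
    """Logistic response function with shape parameter a (assumed form)."""
    return 1.0 / (1.0 + math.exp(-x / a))

def sigmoid_derivative(x, a=1.0):
    """Derivative as in equation (3): f'(x) = (1/a) f(x) (1 - f(x))."""
    fx = sigmoid(x, a)
    return fx * (1.0 - fx) / a

def numeric_derivative(x, a=1.0, h=1e-6):
    """Central finite difference, used only to check the analytic form."""
    return (sigmoid(x + h, a) - sigmoid(x - h, a)) / (2.0 * h)
```

Under this assumed form, a larger a flattens the curve: the slope at x = 0 is 1/(4a).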

The relations among O_k (output from output unit), H_j (output from hidden layer) and Z_jk (connecting weights between hidden and output layer) are given by

O_k=f(X_k), X_k=ΣZ_jkH_j-θ_k (4)
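
The forward pass through the three layers can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the logistic form f(x) = 1/(1 + exp(-x/a)), and it assumes that the hidden-layer net input is formed like equation (4), as a weighted sum of the inputs minus a bias. The names `forward`, `inputs`, `theta_hidden` and `theta_out` are hypothetical.

```python
import math

def sigmoid(x, a=1.0):
    # Assumed logistic form; a is the sigmoid shape parameter.
    return 1.0 / (1.0 + math.exp(-x / a))

def forward(inputs, W, theta_hidden, Z, theta_out, a=1.0):
    """Forward pass of a three-layer network.

    W[i][j] : weight from input i to hidden unit j
    Z[j][k] : weight from hidden unit j to output unit k
    Biases are subtracted, following X_k = sum_j Z_jk H_j - theta_k.
    """
    n_hidden = len(theta_hidden)
    n_out = len(theta_out)
    H = []
    for j in range(n_hidden):
        # Net input to hidden unit j (assumed analogous to equation (4)).
        U_j = sum(inputs[i] * W[i][j] for i in range(len(inputs))) - theta_hidden[j]
        H.append(sigmoid(U_j, a))
    O = []
    for k in range(n_out):
        # Net input to output unit k, equation (4).
        X_k = sum(H[j] * Z[j][k] for j in range(n_hidden)) - theta_out[k]
        O.append(sigmoid(X_k, a))
    return H, O

# Example: 2 inputs, 2 hidden units, 1 output, all weights and biases zero.
H, O = forward([0.3, 0.7], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [[0.0], [0.0]], [0.0])
```

With all weights and biases set to zero, every net input is zero, so each unit returns the sigmoid midpoint of 0.5 regardless of the inputs.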

The connection weights are modified step by step to minimize the mean square error between the observed value and the predicted value. This process is called the learning procedure, and it uses a gradient-descent method (Rumelhart et al., 1986).

The sum of the squared errors E_p between the calculated output O_k and the observed data T_k is described by

E_p is minimized by a gradient-descent method as follows:

The errors between calculated and observed data are

σ^k=T_k-O_k (7)
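
The gradient-descent step for the hidden-to-output weights can be worked out with the chain rule. The derivation below is a sketch under two assumptions that this text does not confirm: that E_p carries the conventional factor 1/2, and that the weight change is the negative gradient scaled by the learning rate μ. The symbol δ_k is taken to be the error signal that appears in equation (10), and the derivative of f is that of equation (3).

```latex
\begin{align*}
E_p &= \tfrac{1}{2}\sum_k (T_k - O_k)^2 \\
\frac{\partial E_p}{\partial Z_{jk}}
  &= \frac{\partial E_p}{\partial O_k}\,
     \frac{\partial O_k}{\partial X_k}\,
     \frac{\partial X_k}{\partial Z_{jk}}
   = -(T_k - O_k)\, f'(X_k)\, H_j \\
f'(X_k) &= \tfrac{1}{a}\, O_k (1 - O_k) \\
\Delta Z_{jk} &= -\mu \frac{\partial E_p}{\partial Z_{jk}}
   = \mu\, \delta_k H_j,
   \qquad \delta_k = \tfrac{1}{a}\,(T_k - O_k)\, O_k (1 - O_k)
\end{align*}
```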

Using the differentiable sigmoid function, we can rewrite equation (6) as follows:

To avoid oscillation at learning rate μ, the change in weight is made dependent on the past weight change by adding a momentum term (α).

ΔZ_jk(t)=μδ_kH_j+αΔZ_jk(t-1) (10)
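
A minimal sketch of the momentum update in equation (10) is given below. It reads the ΔZ_jk on the right-hand side as the weight change from the previous step and treats the error signals δ_k as already computed. The function name, the argument names and the default values of `mu` and `alpha` are illustrative assumptions, not values from the paper.

```python
def update_output_weights(Z, prev_dZ, H, delta, mu=0.5, alpha=0.9):
    """Apply equation (10) to every hidden-to-output weight.

    Z[j][k]       : current weights
    prev_dZ[j][k] : weight changes from the previous step
    H[j]          : hidden-layer outputs
    delta[k]      : error signals for the output units
    Returns the new weights and the weight changes just applied.
    """
    new_Z, dZ = [], []
    for j in range(len(H)):
        # Gradient term plus momentum term for each output unit k.
        dZ_row = [mu * delta[k] * H[j] + alpha * prev_dZ[j][k]
                  for k in range(len(delta))]
        dZ.append(dZ_row)
        new_Z.append([Z[j][k] + dZ_row[k] for k in range(len(delta))])
    return new_Z, dZ
```

The returned weight changes are fed back in as `prev_dZ` on the next call, so successive changes in the same direction reinforce each other while changes of alternating sign partly cancel.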

The derivative of E_p with respect to W_ij is calculated as follows:
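
One way to write this derivative out is the standard back-propagation chain rule. The sketch assumes that the hidden-layer net input has the form U_j = Σ_i W_ij I_i − θ_j with H_j = f(U_j) (the input symbol I_i is a placeholder, not taken from the text), that E_p = ½ Σ_k (T_k − O_k)², and that the derivative of f is that of equation (3).

```latex
\begin{align*}
\frac{\partial E_p}{\partial W_{ij}}
  &= \sum_k \frac{\partial E_p}{\partial O_k}\,
            \frac{\partial O_k}{\partial X_k}\,
            \frac{\partial X_k}{\partial H_j}\,
            \frac{\partial H_j}{\partial U_j}\,
            \frac{\partial U_j}{\partial W_{ij}} \\
  &= -\sum_k (T_k - O_k)\, f'(X_k)\, Z_{jk}\; f'(U_j)\, I_i \\
  &= -\tfrac{1}{a}\, H_j (1 - H_j)\, I_i \sum_k \delta_k Z_{jk},
  \qquad \delta_k = (T_k - O_k)\, f'(X_k)
\end{align*}
```

Each hidden unit thus receives the output error signals δ_k weighted by the connections Z_jk through which it influenced the outputs.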