NEURAL NETWORK APPROACH
The neural network approach, which has been developed recently, is based on the concept of learning activities, whereby the procedure itself evolves an optimal model relating the key variables.
Figure 5. The architecture of a typical backpropagation neural network:
the output cell of a perceptron (a), the sigmoid function (b), and a simple 3-layer neural network (c).
Neural network computing differs from artificial intelligence and traditional
computing in several important ways. Unlike traditional expert systems where the knowledge is made explicit
in the form of rules, neural networks generate their own rules by learning from the examples shown to
them. Learning is activated through a learning rule, which adapts or changes the connection weights of
the network in response to the example inputs and the desired outputs to those inputs. Learning in neural
networks refers to the processes of acquiring a desired behavior by changing the connection weights. Therefore,
the advantage of neural networks is that no subjective information is required to determine the model
structure or estimate the parameters. Thus, in analyzing large data sets with no a priori knowledge of the
processes or causality, the neural network is the more pragmatic approach. In terms of representing processes,
however, neural networks are severely limited. Only generalized process statements can be made from neural
network models, and for a process interpretation of algal growth behavior the dynamic mass balance and
growth equation approach will always be required. Whilst these models are essential for understanding process
interactions, their predictive capability is no better than that of the neural network approach. We propose
here a red tide prediction model based on one of the principles of the neural network: the backpropagation
algorithm (Rumelhart et al., 1986; Abdi, 1994).
The neural network was developed as a three-layer learning network, consisting of an input layer, a hidden layer and an output layer (Fig. 5c). Each layer is made up of several nodes, and the layers are connected by sets of connection weights. The nodes receive input either from outside the model (the initial inputs) or from the connections. Each node operates on its input, transforming it to produce an analogue output. The weights multiply an incoming signal before it arrives at the next layer. The transformation associated with each node is a sigmoid function (Fig. 5b).
NEURAL NETWORK ALGORITHM
The output of each hidden node is computed from the inputs I_{i} as

H_{j}=f(U_{j}), U_{j}=ΣW_{ij}I_{i}-θ_{j} (1)

Here, θ_{j} is the bias value and W_{ij} are the connection weights between the input layer and the hidden layer. H_{j}, the output of the response function f, is normalized between [0,1] by the logistic function and is transferred to the output layer. The logistic function is defined as follows:

f(U_{j})=1/{1+exp(-U_{j}/a)} (2)
Here a is the sigmoid shape parameter, which controls the gradient of the sigmoid (Fig. 5b). The sigmoid function is differentiable, with derivative:
f'(U_{j})=(1/a)f(U_{j}){1-f(U_{j})} (3)
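The derivative identity in equation (3) can be checked numerically, assuming the logistic form f(U) = 1/(1 + exp(-U/a)) that yields it (a minimal sketch; the value a = 0.5 is an arbitrary illustration):

```python
import math

def f(u, a):
    """Sigmoid with shape parameter a: f(u) = 1 / (1 + exp(-u / a))."""
    return 1.0 / (1.0 + math.exp(-u / a))

def f_prime(u, a):
    """Analytic derivative from equation (3): (1/a) * f * (1 - f)."""
    return (1.0 / a) * f(u, a) * (1.0 - f(u, a))

a, h = 0.5, 1e-6
for u in (-2.0, 0.0, 1.5):
    # Central finite difference should match the analytic derivative.
    numeric = (f(u + h, a) - f(u - h, a)) / (2 * h)
    assert abs(numeric - f_prime(u, a)) < 1e-6
```

Smaller values of a steepen the sigmoid, which is the gradient change shown in Fig. 5b.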
The relations among O_{k} (the output from the output layer), H_{j} (the output from the hidden layer) and Z_{jk} (the connection weights between the hidden and output layers) are given by

O_{k}=f(X_{k}), X_{k}=ΣZ_{jk}H_{j}-θ_{k} (4)

where θ_{k} is the bias value of the output node.
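The forward pass through the three-layer network of Fig. 5c, from the inputs through the hidden-layer outputs H_{j} to the outputs O_{k} of equation (4), can be sketched as follows (an illustrative Python sketch; the layer sizes, random weights and zero biases are assumptions, not values from the paper):

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Logistic response function with shape parameter a."""
    return 1.0 / (1.0 + np.exp(-x / a))

def forward(inputs, W, theta_h, Z, theta_o, a=1.0):
    """One forward pass: input layer -> hidden layer -> output layer.

    W: (n_in, n_hidden) input-to-hidden connection weights W_ij
    Z: (n_hidden, n_out) hidden-to-output connection weights Z_jk
    theta_h, theta_o: bias values of the hidden and output nodes
    """
    U = inputs @ W - theta_h   # net input to hidden nodes
    H = sigmoid(U, a)          # hidden-layer outputs H_j
    X = H @ Z - theta_o        # net input to output nodes (eq. 4)
    O = sigmoid(X, a)          # network outputs O_k
    return H, O

# Illustrative example: 3 inputs, 4 hidden nodes, 1 output
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
Z = rng.normal(size=(4, 1))
H, O = forward(np.array([0.2, 0.5, 0.1]), W, np.zeros(4), Z, np.zeros(1))
```

Because the sigmoid maps any net input into (0, 1), the outputs O_{k} are bounded regardless of the weight values.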
The connection weights are modified step by step to minimize the mean square
error between the observed value and the predicted value. This process is called the learning procedure and
uses a gradient-descent method to minimize the mean square error (Rumelhart et al., 1986).
The sum of the squared errors, E_{p}, between the calculated output O_{k} and the observed data T_{k}, is described by

E_{p}=(1/2)Σ(T_{k}-O_{k})^{2} (5)
E_{p} is minimized by a gradient-descent method as follows:

ΔZ_{jk}=-μ(∂E_{p}/∂Z_{jk}) (6)

where μ is the learning rate.
The error between the calculated and observed data is

σ_{k}=T_{k}-O_{k} (7)
Using the derivative of the differentiable sigmoid function, we can rewrite equation (6) as follows:

δ_{k}=σ_{k}f'(X_{k}) (8)

ΔZ_{jk}=μδ_{k}H_{j} (9)
To avoid oscillation at a large learning rate μ, the change in weight is made dependent on the past weight change by adding a momentum term (α):
ΔZ_{jk}(n+1)=μδ_{k}H_{j}+αΔZ_{jk}(n) (10)
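The complete learning procedure (forward pass, backpropagated error signal, and the momentum update of equation (10)) can be sketched as follows. This is an illustrative single-hidden-layer implementation, not the authors' model: the training data, learning rate mu, momentum alpha and layer sizes are arbitrary choices, and the bias terms are omitted for brevity.

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Logistic response function with shape parameter a."""
    return 1.0 / (1.0 + np.exp(-x / a))

def train(inputs, targets, n_hidden=4, mu=0.5, alpha=0.5, a=1.0,
          epochs=500, seed=1):
    """Backpropagation with momentum for a three-layer network.

    mu: learning rate; alpha: momentum term; a: sigmoid shape parameter.
    Weight changes follow eq. (10): dZ(n+1) = mu*delta_k*H_j + alpha*dZ(n).
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]
    W = rng.normal(scale=0.5, size=(n_in, n_hidden))   # input-hidden weights
    Z = rng.normal(scale=0.5, size=(n_hidden, n_out))  # hidden-output weights
    dW, dZ = np.zeros_like(W), np.zeros_like(Z)
    errors = []
    for _ in range(epochs):
        H = sigmoid(inputs @ W, a)                  # hidden outputs H_j
        O = sigmoid(H @ Z, a)                       # network outputs O_k
        sigma = targets - O                         # raw error T_k - O_k
        delta_o = sigma * O * (1 - O) / a           # error signal via eq. (3)
        delta_h = (delta_o @ Z.T) * H * (1 - H) / a # error propagated back
        dZ = mu * (H.T @ delta_o) + alpha * dZ      # eq. (10), with momentum
        dW = mu * (inputs.T @ delta_h) + alpha * dW
        Z += dZ
        W += dW
        errors.append(0.5 * np.sum(sigma ** 2))     # E_p for this epoch
    return W, Z, errors

# Two illustrative training patterns with targets inside the sigmoid range
X = np.array([[0.0, 1.0], [1.0, 0.0]])
T = np.array([[0.9], [0.1]])
W, Z, errors = train(X, T)
```

The recorded errors shrink as the weights are adjusted step by step, which is exactly the minimization of E_{p} that the learning procedure describes.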
The derivative of E_{p} with respect to W_{ij} is calculated as follows
