defined as

R = (1/K) Σ_{i=1}^{K} R_i        (22)

In addition, the performance of training is assessed using
scatter plots of y_i versus d_i, and the scatter of (d_i, y_i) from the 45° line is assessed using two error bounds defined as

y_a = (1 - e)d        (23a)
y_b = (1 + e)d        (23b)

where e = specified error in decimal values; y_a = lower error bound corresponding to e; and y_b = upper error bound corresponding to e. A scatter plot helps to assess ANN performance more effectively compared to R. In the
ANN testing phase, the objective is to match d_i with y_i at each ith output neuron for all the patterns in S_2. As such, the performance of testing in predicting G(·) is assessed by repeating the above procedure for S_2.

6 ARTIFICIAL NEURAL NETWORK ASSESSMENT
Several example scenarios related to the GFCT model are solved to assess the applicability of ANN. In general, the accuracy of ANN training is observed to be approximately 100%, and the generalization in ANN testing is observed to limit the applicability of ANN. In the next section, the applicability of ANN with respect to only ANN testing is discussed.
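The assessment procedure described above (the correlation coefficient R of eqn (22) and the error bounds of eqns (23a) and (23b)) can be sketched as follows. The data, the function names, and the 10% error level are illustrative assumptions, not values from this study:

```python
import numpy as np

def correlation_r(d, y):
    """Correlation coefficient R between desired (d) and predicted (y) responses."""
    return np.corrcoef(d, y)[0, 1]

def fraction_within_bounds(d, y, e):
    """Fraction of (d_i, y_i) points between the bounds y_a = (1 - e) d (eqn (23a))
    and y_b = (1 + e) d (eqn (23b)) around the 45-degree line."""
    y_a = (1.0 - e) * d  # lower error bound
    y_b = (1.0 + e) * d  # upper error bound
    return np.mean((y >= y_a) & (y <= y_b))

# Illustrative desired/predicted values: predictions stay within ~5% of d
d = np.linspace(1.0, 16.0, 50)
y = d * (1.0 + 0.05 * np.sin(d))
R = correlation_r(d, y)
frac_10 = fraction_within_bounds(d, y, 0.10)  # all points inside the 10% bounds
```

Points outside such bounds flag large relative errors even when R is high, which is the motivation for pairing scatter plots with R.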
6.1 Example 1: simulation of the BTC with linear adsorption
ANN was used to simulate the breakthrough concentration, C, with n = 1 (Fig. 2). C is a nonlinear, nonmonotonic function of time, t, and helps to assess the applicability of ANN in simulating such a function. As the input is only time (I = 1), a hidden layer of three neurons (J = 3) was used,3 and a 1-3-1 ANN was used to simulate C. In predicting C for 160 days, HYDRUS generates 1618 patterns of (C, t); these patterns are included in the training and testing subset, S, and the allocation method was used for allocating S to S_1 and S_2. As no guideline for the allocation exists, f̃ = 0.5 was used to minimize bias in the allocation, and r = 1 was
arbitrarily taken. Fig. 3 shows the predicted C when the ANN is trained using sgm(·). The results show that the ANN fails to simulate C correctly. In order to improve the ANN performance, attempts were made to train the ANN with f̃ = {0.25, 0.75}, r = {5, 257}, and J = {15, 30}; yet, the performance did not improve. Thereafter, the ANN was trained using tanh(·) as the transfer function. The results in Fig. 3 show that the performance of the ANN improves dramatically, and the ANN induced only small errors in the breakthrough time and maximum concentration. Fig. 4 shows the scatter plot of desired (HYDRUS) and predicted (ANN) responses, with R = 0.995, where R is the correlation coefficient. Although R is high, a few large errors are observed for C < 3 mg L^-1, and errors of 10–12% are observed for C > 3 mg L^-1. As such, R alone appears not to be a robust index for assessing the performance of ANN, and scatter plots with appropriate error bounds are needed to assess the performance robustly. Finally, the sensitivity of the ANN to f̃ = {0.25, 0.5, 0.75}, r = {1, 5, 257}, and J = {3, 15, 30} was analyzed, and the corresponding R values are observed to vary over R ∈ [0.994, 0.998]. Although not shown here, these results suggested that the ANN is insensitive to these parameters for the present example.
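The setup of this example (a random 50/50 allocation of the (t, C) patterns into S_1 and S_2, and a 1-3-1 network with a tanh hidden layer) can be sketched as below. The synthetic BTC-like curve, the interpretation of r = 1 as a random seed, and all weight values are assumptions made for illustration; this is not the trained network of Fig. 3:

```python
import numpy as np

rng = np.random.default_rng(1)  # r = 1 interpreted here as a random seed (assumption)

# Synthetic stand-in for the 1618 HYDRUS-generated (t, C) patterns
t = np.linspace(0.0, 160.0, 1618)
C = 15.0 * np.exp(-((t - 60.0) / 25.0) ** 2)  # nonmonotonic, BTC-like shape

# Random allocation: a fraction f_tilde of the patterns to S_1, the rest to S_2
f_tilde = 0.5
mask = rng.random(t.size) < f_tilde
S1 = (t[mask], C[mask])    # training subset
S2 = (t[~mask], C[~mask])  # testing subset

def ann_131(x, w1, b1, w2, b2):
    """Forward pass of a 1-3-1 network: 1 input, 3 tanh hidden neurons, 1 output."""
    h = np.tanh(np.outer(x, w1) + b1)  # hidden layer, shape (len(x), 3)
    return h @ w2 + b2                 # single linear output neuron

# Evaluate the (untrained) network on the testing subset with arbitrary weights
y = ann_131(S2[0], w1=np.array([0.1, -0.2, 0.3]),
            b1=np.zeros(3), w2=np.array([1.0, 1.0, 1.0]), b2=0.0)
```

Training the weights (e.g. by back-propagation) and repeating the evaluation on S_2 mirrors the testing procedure of Section 5.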
As sgm(·) and tanh(·) are considered equally applicable,2 the present example suggests that these two functions may be regarded as complementary. In cases where one function fails to perform robustly, the other may be used. Also, a careful examination of these results reveals one important aspect of ANN training. C is mostly small within the 160 days, except near the peak concentration times, and this may have caused the poor performance of sgm(·).12 Both sgm(·) and tanh(·) are s-shaped functions, but they
[Fig. 3. Breakthrough concentration of Example 1 with linear adsorption using HYDRUS and 1-3-1 ANN with two different transfer functions. Axes: time (days) versus concentration (µg L^-1); curves: HYDRUS, ANN tanh, ANN sgm.]

[Fig. 4. Performance of 1-3-1 ANN in predicting the breakthrough concentration of Example 1. Axes: desired versus predicted concentration (µg L^-1), with error-bound lines at -20, -10, +10 and +20.]

152 J. Morshed, J. J. Kaluarachchi
differ in their output ranges (Fig. 1b). The output range of sgm(·) is (0, 1), while the output range of tanh(·) is (-1, +1). As BPA uses the output of a transfer function as a multiplier in the weight update, sgm(·) produces a small multiplier when the summation is small, and vice versa. Therefore, there is a bias towards training higher desired outputs. In contrast, tanh(·) produces equal-magnitude multipliers when the summation is either small or large and, therefore, tanh(·) leads to no bias towards training lower or higher desired outputs.
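A minimal numeric illustration of the range argument above, using the standard logistic and tanh forms (assumed here to match the functions of Fig. 1b): at symmetric summations ±s, sgm(·) returns values of very unequal magnitude, while tanh(·) returns values of equal magnitude.

```python
import numpy as np

def sgm(s):
    """Logistic sigmoid with output range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-s))

s = 4.0
sgm_low, sgm_high = sgm(-s), sgm(+s)            # ~0.018 vs ~0.982: unequal magnitudes
tanh_low, tanh_high = np.tanh(-s), np.tanh(+s)  # ~-0.999 vs ~+0.999: equal magnitudes
```

When the desired outputs are mostly near the low end of sgm(·)'s range, the small multiplier suppresses weight updates, consistent with the poor sgm(·) performance observed in Example 1.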
The ANN results are further studied to determine the effect of the number of hidden nodes on the weight distribution. Fig. 5 shows the weight histograms of 1-J-1 ANNs for J = {3, 15, 30}, and the weights are observed to decrease with increasing J. As J is increased, more terms are considered in the argument of an output neuron (eqn (2b)), and its weighted sum becomes very small or very large if the weights do not decrease. At a very small or a very large sum, the output of a transfer function is essentially independent of the sum (Fig. 1b), and the ANN may fail to approximate the input–output response robustly. As such, BPA decreases the weights with increasing J to train the ANN robustly and makes the ANN insensitive to J over a wide range. However, this observation contradicts recent observations by others suggesting poor ANN testing with increasing J beyond an optimal value.4,5 As such, the present example suggests that the ANN is sufficiently, but not completely, insensitive to J around the optimal value.
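The saturation argument above can be illustrated numerically (the values below are illustrative assumptions, not this study's computation): with the hidden outputs and weight scale held fixed, the sensitivity of tanh at the output neuron's weighted sum collapses toward zero as more terms enter the sum, unless the weights shrink with J.

```python
import numpy as np

def output_sensitivity(J, w_scale):
    """Derivative of tanh at a J-term weighted sum of hidden outputs."""
    h = np.full(J, 0.5)            # hidden outputs fixed at a representative 0.5
    w = np.full(J, w_scale)        # equal output-layer weights of scale w_scale
    s = w @ h                      # weighted sum grows linearly with J
    return 1.0 - np.tanh(s) ** 2   # near 0 means the output neuron is saturated

# Fixed weight scale: at J = 30 the output neuron is deeply saturated ...
sens_large_J = output_sensitivity(J=30, w_scale=1.0)
# ... while shrinking the weights as 1/J keeps the neuron responsive.
sens_scaled = output_sensitivity(J=30, w_scale=1.0 / 30)
```

This is consistent with the weight histograms of Fig. 5: decreasing weights keep the weighted sum in the responsive region of the transfer function as J grows.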
6.2 Example 2: simulation of BTC with nonlinear adsorption