
defined as

$$R = \frac{1}{K}\sum_{i=1}^{K} R_i \qquad (22)$$

In addition, the performance of training is assessed using scatter plots of y_i versus d_i, and the scatter of (d_i, y_i) from the 45° line is assessed using two error bounds defined as

$$y_a = (1 - e)\,d \qquad (23a)$$

$$y_b = (1 + e)\,d \qquad (23b)$$

where e = specified error in decimal form; y_a = lower error bound corresponding to e; and y_b = upper error bound corresponding to e. A scatter plot helps to assess ANN performance more effectively than R alone. In the ANN testing phase, the objective is to match d_i with y_i at each ith output neuron for all the patterns in S_2. As such, the performance of testing in predicting G(·) is assessed by repeating the above procedure for S_2.

6 ARTIFICIAL NEURAL NETWORK ASSESSMENT

Several example scenarios related to the GFCT model are solved to assess the applicability of ANN. In general, the accuracy of ANN training is observed to be approximately 100%, and it is the generalization observed during ANN testing that limits the applicability of ANN. In the next section, the applicability of ANN with respect to ANN testing only is discussed.
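Before turning to the examples, the following is a minimal sketch of how the assessment indices of eqns (22), (23a), and (23b) might be computed, assuming desired and predicted outputs are stored as NumPy arrays; the function names and array layout are illustrative, not from the original study.

```python
import numpy as np

def mean_correlation(d, y):
    """Mean correlation coefficient over the K output neurons, eqn (22).
    d, y: (patterns, K) arrays of desired and predicted outputs."""
    K = d.shape[1]
    return np.mean([np.corrcoef(d[:, k], y[:, k])[0, 1] for k in range(K)])

def fraction_within_bounds(d, y, e=0.10):
    """Fraction of scatter points (d, y) lying between the bounds
    y_a = (1 - e) d and y_b = (1 + e) d around the 45-degree line,
    eqns (23a) and (23b); e is the specified error in decimal form."""
    y_a = (1.0 - e) * d  # lower bound
    y_b = (1.0 + e) * d  # upper bound
    return np.mean((y >= y_a) & (y <= y_b))
```

Applied to the testing subset S_2, these two indices together give the numerical counterpart of the scatter-plot assessment described above.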

6.1 Example 1: simulation of the BTC with linear adsorption

ANN was used to simulate the breakthrough concentration, C, with n = 1 (Fig. 2). C is a nonlinear, nonmonotonic function of time, t, and therefore helps to assess the applicability of ANN in simulating such a function. As the only input is time (I = 1), a hidden layer of three neurons (J = 3) was used [3], giving a 1-3-1 ANN to simulate C. In predicting C over 160 days, HYDRUS generates 1618 patterns of (C, t); these patterns make up the training-and-testing set, S, and the allocation method was used to allocate S to S_1 and S_2. As no guideline for the allocation exists, f̃ = 0.5 was used to minimize bias in the allocation, and r = 1 was taken arbitrarily.

Fig. 3 shows the predicted C when the ANN is trained using sgm(·). The results show that the ANN fails to simulate C correctly. In order to improve the ANN performance, attempts were made to train the ANN with f̃ = {0.25, 0.75}, r = {5, 257}, and J = {15, 30}; yet the performance did not improve. Thereafter, the ANN was trained using tanh(·) as the transfer function. The results in Fig. 3 show that the performance of the ANN improves dramatically, with only small errors in the breakthrough time and the maximum concentration. Fig. 4 shows the scatter plot of desired (HYDRUS) versus predicted (ANN) responses, with a correlation coefficient of R = 0.995. Although R is high, a few large errors are observed for C < 3 µg L⁻¹, and errors of 10–12% are observed for C > 3 µg L⁻¹. As such, R alone appears not to be a robust index for assessing the performance of ANN, and scatter plots with appropriate error bounds are needed to assess the performance robustly. Finally, the sensitivity of the ANN to f̃ = {0.25, 0.5, 0.75}, r = {1, 5, 257}, and J = {3, 15, 30} was analyzed, and the corresponding R values were observed to vary over R ∈ [0.994, 0.998]. Although not shown here, these results suggested that the ANN is insensitive to these parameters for the present example.

[Fig. 3. Breakthrough concentration of Example 1 with linear adsorption using HYDRUS and 1-3-1 ANN with two different transfer functions. Axes: concentration (µg/L) versus time (days); curves: HYDRUS, ANN tanh(·), ANN sgm(·).]

[Fig. 4. Performance of 1-3-1 ANN in predicting the breakthrough concentration of Example 1. Predicted versus desired concentration (µg/L), with ±10% and ±20% error bounds about the 45° line.]

As sgm(·) and tanh(·) are considered equally applicable [2], the present example suggests that these two functions may be regarded as complementary: in cases where one function fails to perform robustly, the other may be used. A careful examination of these results also reveals one important aspect of ANN training. C is mostly small within the 160 days, except near the peak concentration times, and this may have caused the poor performance of sgm(·) [12]. Both sgm(·) and tanh(·) are s-shaped functions, but they differ in their output ranges (Fig. 1b): the output range of sgm(·) is (0, 1), while the output range of tanh(·) is (−1, +1). As BPA uses the output of a transfer function as a multiplier in the weight update, sgm(·) produces a small multiplier when the summation is small, and vice versa; there is therefore a bias towards training higher desired outputs. In contrast, tanh(·) produces multipliers of equal magnitude when the summation is either small or large and therefore leads to no bias towards training lower or higher desired outputs.
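To make the experiment concrete, below is a minimal sketch of a 1-J-1 network trained by plain backpropagation on a synthetic, nonmonotonic stand-in for the HYDRUS breakthrough curve, with the transfer function switchable between sgm(·) and tanh(·). The Gaussian pulse, learning rate, epoch count, and weight initialization are assumptions for illustration only (the study's exact BPA settings are not given here), so the sketch will not necessarily reproduce the sgm(·) failure reported above.

```python
import numpy as np

rng = np.random.default_rng(1)  # seed 1, echoing the arbitrary r = 1 above

# Synthetic stand-in for the HYDRUS curve: a nonmonotonic pulse in time
# (the real study used 1618 (t, C) patterns over 160 days).
t = np.linspace(0.0, 160.0, 1618)
C = 16.0 * np.exp(-((t - 60.0) / 25.0) ** 2)   # assumed pulse, peak ~16 µg/L

# Random 50/50 allocation into training (S1) and testing (S2), i.e. f~ = 0.5
idx = rng.permutation(t.size)
train, test = idx[: idx.size // 2], idx[idx.size // 2:]

# Scale input and output into the transfer function's working range
x = (t / 160.0).reshape(-1, 1)
d = (C / C.max()).reshape(-1, 1)

def fit_1J1(x, d, J=3, transfer="tanh", eta=0.1, epochs=5000):
    """Train a 1-J-1 network by plain backpropagation (a sketch,
    not the authors' exact BPA)."""
    if transfer == "tanh":
        f, df = np.tanh, lambda a: 1.0 - a ** 2
    else:                                       # logistic sigmoid, sgm(.)
        f = lambda s: 1.0 / (1.0 + np.exp(-s))
        df = lambda a: a * (1.0 - a)
    W1, b1 = rng.normal(0.0, 0.5, (1, J)), np.zeros(J)
    W2, b2 = rng.normal(0.0, 0.5, (J, 1)), np.zeros(1)
    for _ in range(epochs):
        h = f(x @ W1 + b1)                      # hidden-layer outputs
        y = f(h @ W2 + b2)                      # output-layer response
        e2 = (y - d) * df(y)                    # output delta
        e1 = (e2 @ W2.T) * df(h)                # hidden delta
        W2 -= eta * h.T @ e2 / len(x); b2 -= eta * e2.mean(0)
        W1 -= eta * x.T @ e1 / len(x); b1 -= eta * e1.mean(0)
    return lambda x: f(f(x @ W1 + b1) @ W2 + b2)

for tf in ("sgm", "tanh"):
    net = fit_1J1(x[train], d[train], transfer=tf)
    y = net(x[test])
    R = np.corrcoef(d[test].ravel(), y.ravel())[0, 1]
    print(f"{tf}: testing R = {R:.3f}")
```

The testing R printed here is the single-output special case of eqn (22), so the same scatter-plot and error-bound checks sketched earlier apply directly.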
The ANN results are further studied to determine the effect of the number of hidden nodes on the weight distribution. Fig. 5 shows the weight histograms of the 1-J-1 ANN for J = {3, 15, 30}, and the weights are observed to decrease with increasing J. As J is increased, more terms are considered in the argument of an output neuron (eqn (2b)), and its weighted sum becomes very small or very large if the weights do not decrease. At a very small or a very large sum, the output of a transfer function is essentially independent of the sum (Fig. 1b), and the ANN may fail to approximate the input–output response robustly. As such, BPA decreases the weights with increasing J to train the ANN robustly, which makes the ANN insensitive to J over a wide range. However, this observation contradicts recent observations by others suggesting poor ANN testing with increasing J beyond an optimal value [4,5]. As such, the present example suggests that ANN is sufficiently, but not completely, insensitive to J around the optimal value.
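The shrinking-weights observation can be probed with a small numerical experiment: train 1-J-1 networks for J = {3, 15, 30} on the same kind of synthetic curve and summarize the trained weight magnitudes, as sketched below. The training settings are assumptions, and whether the decreasing trend emerges depends on them; Fig. 5 reports the trend for the study's actual runs.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 160.0, 1618).reshape(-1, 1) / 160.0
d = np.exp(-((t * 160.0 - 60.0) / 25.0) ** 2)   # normalized synthetic BTC

for J in (3, 15, 30):
    W1, b1 = rng.normal(0.0, 0.5, (1, J)), np.zeros(J)
    W2, b2 = rng.normal(0.0, 0.5, (J, 1)), np.zeros(1)
    for _ in range(3000):                        # plain BPA with tanh(.)
        h = np.tanh(t @ W1 + b1)
        y = np.tanh(h @ W2 + b2)
        e2 = (y - d) * (1.0 - y ** 2)
        e1 = (e2 @ W2.T) * (1.0 - h ** 2)
        W2 -= 0.1 * h.T @ e2 / len(t); b2 -= 0.1 * e2.mean(0)
        W1 -= 0.1 * t.T @ e1 / len(t); b1 -= 0.1 * e1.mean(0)
    w = np.concatenate([W1.ravel(), W2.ravel()])
    # Compare the spread of trained weights across J, as in Fig. 5
    print(f"J = {J:2d}: mean |w| = {np.abs(w).mean():.3f}")
```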

6.2 Example 2: simulation of BTC with nonlinear adsorption