Appendix H Formal Models for Artificial Intelligence Methods: Mathematical Model of Neural Network Learning

When we discussed neural networks in Chap. 11, we introduced the basic model of their learning, namely the backpropagation method. In the second section of this appendix we present a formal justification of the principles of this method. In the first section we introduce the basic notions of mathematical analysis [121, 122, 237] that are used for this justification.

H.1 Selected Notions of Mathematical Analysis

Firstly, let us introduce the notions of a vector space and a normed vector space.

Definition H.1 Let V be a nonempty set closed under an addition operation +, and let K be a field. Let · be an external operation of left-hand-side multiplication, i.e., a mapping from K × V to V, whose result for a pair (a, w) ∈ K × V is denoted a · w, briefly aw.

A vector space is a structure consisting of the set V, the field K, and the operations +, ·, which fulfils the following conditions:

• The set V with the operation + is an Abelian group.
• ∀a, b ∈ K, w ∈ V : a(bw) = (ab)w.
• ∀a, b ∈ K, w ∈ V : (a + b)w = aw + bw.
• ∀a ∈ K, w, u ∈ V : a(w + u) = aw + au.
• ∀w ∈ V : 1 · w = w, where 1 is the identity element of multiplication in K.

Definition H.2 Let X be a vector space over a field K. A norm on X is a mapping ‖·‖ : X −→ R_+ fulfilling the following conditions:

• ∀x ∈ X : ‖x‖ = 0 ⇔ x = 0, where 0 is the zero vector in X.
• ∀x ∈ X, λ ∈ K : ‖λx‖ = |λ| · ‖x‖.
• ∀x, y ∈ X : ‖x + y‖ ≤ ‖x‖ + ‖y‖.
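For instance, the Euclidean norm on X = R^n,

    ‖x‖ = ( Σ_{i=1}^{n} x_i² )^{1/2},

fulfils all three conditions; the third one is the well-known triangle inequality.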

Definition H.3 Let ‖·‖ be a norm on a vector space X. The pair (X, ‖·‖) is called a normed vector space.

Further on, we assume that X is a normed vector space.

Now, we can define the notions of a directional derivative, a partial derivative, and a gradient. Let U ⊂ X be an open subset of X.

Definition H.4 Let f : U −→ R be a function, a ∈ U, and v ∈ X. If there exists a limit of the difference quotient

    lim_{t→0} ( f(a + tv) − f(a) ) / t,

then this limit is called the directional derivative of the function f along the vector v at the point a, denoted ∂_v f(a).
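As a simple illustration, consider X = R², f(x_1, x_2) = x_1² + x_2², a = (1, 1), and v = (1, 0). Then

    ∂_v f(a) = lim_{t→0} ( (1 + t)² + 1 − 2 ) / t = lim_{t→0} ( 2t + t² ) / t = 2.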

Let X = R^n, and let the vectors e_1 = (1, 0, 0, . . . , 0), e_2 = (0, 1, 0, . . . , 0), . . . , e_n = (0, 0, 0, . . . , 1) constitute the canonical basis of the space X. Let U ⊂ X be an open subset of X.

Definition H.5 If there exist directional derivatives ∂_{e_1} f(a), ∂_{e_2} f(a), . . . , ∂_{e_n} f(a) of a function f : U −→ R along the vectors of the canonical basis e_1, e_2, . . . , e_n, then they are called the partial derivatives of the function f at the point a, denoted

    ∂f/∂x_1 (a), ∂f/∂x_2 (a), . . . , ∂f/∂x_n (a).

Let f : U −→ R be a function, where the set U ⊂ R^n is open, and let us assume that the partial derivatives ∂f/∂x_1 (a), ∂f/∂x_2 (a), . . . , ∂f/∂x_n (a) exist at a point a ∈ U.

Definition H.6 The vector

    ∇f(a) = ( ∂f/∂x_1 (a), ∂f/∂x_2 (a), . . . , ∂f/∂x_n (a) )

is called the gradient of the function f at the point a.

Theorem H.1 At a given point, the directional derivative has the maximum absolute value in the direction of the gradient vector. Thus, a function increases (or decreases) most rapidly in the direction of the gradient.
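Theorem H.1 follows from the fact that, for a differentiable function f, the directional derivative along a unit vector v is the inner product

    ∂_v f(a) = ⟨∇f(a), v⟩,

and, by the Cauchy–Schwarz inequality, |⟨∇f(a), v⟩| ≤ ‖∇f(a)‖ · ‖v‖, with equality exactly when v is parallel to ∇f(a).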

We will make use of this property in the next section.

H.2 Backpropagation Learning of Neural Networks

In this section we introduce a formalization of the backpropagation method of neural network learning [252], which was presented in an intuitive way in Chap. 11. Firstly, let us discuss its general idea.

We train a neural network, i.e., we modify its weights, in order to minimize an error function for the classification of vectors belonging to the training set. All the weights of the neural network are variables of this function. Let us denote this function with E(W), where W = (W_1, W_2, . . . , W_N) is the vector of the weights of all the neurons. At the j-th step of the learning process we have an error E(W(j)), briefly E(j). This error will be minimized with the method of steepest descent, which can be defined in the following way:

    W(j + 1) = W(j) − α ∇E(W(j)),   (H.1)

where α > 0 is the learning coefficient and

    ∇E(W(j)) = ( ∂E(j)/∂W_1(j), ∂E(j)/∂W_2(j), . . . , ∂E(j)/∂W_N(j) )

is the gradient of the function E.

Now, let us introduce denotations according to those used in Chap. 11. N^{(r)(k)} denotes the k-th neuron of the r-th layer. Let us assume that the network consists of L layers, and the r-th layer consists of M_r neurons. The output signal of the k-th neuron of the r-th layer at the j-th step of learning is denoted with y^{(r)(k)}(j). The input signal at the i-th input of the k-th neuron of the r-th layer at the j-th step of learning is denoted with X_i^{(r)(k)}(j), and the corresponding weight is denoted with W_i^{(r)(k)}(j).
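To make the steepest-descent rule (H.1) concrete, the following minimal Python sketch iterates it numerically. It is an illustration, not the book's implementation: the quadratic error function, the vector W_opt, and the value of the learning coefficient alpha are assumptions chosen only for the example.

    import numpy as np

    # Illustrative error function E(W) = sum((W - W_opt)**2);
    # any differentiable error function could be used instead.
    W_opt = np.array([1.0, -2.0, 0.5])

    def grad_E(W):
        # gradient of E(W) = sum((W - W_opt)**2) with respect to W
        return 2.0 * (W - W_opt)

    alpha = 0.1                    # learning coefficient (assumed value)
    W = np.zeros(3)                # initial weight vector W(0)

    for j in range(100):           # j-th step of the learning process
        # formula (H.1): W(j+1) = W(j) - alpha * grad E(W(j))
        W = W - alpha * grad_E(W)

    print(W)                       # converges towards W_opt

Because the step is always taken against the gradient, by Theorem H.1 each iteration moves in the direction of the (locally) fastest decrease of E.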

Let us define the function E as the mean squared error function at the output of the network, i.e.

    E(j) = (1/2) Σ_{m=1}^{M_L} ( u^{(m)}(j) − y^{(L)(m)}(j) )²,   (H.2)

where u^{(m)}(j) is the required output signal for the m-th neuron of the L-th layer at the j-th step.
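For example, for a network with two output neurons, required outputs u(j) = (1, 0), and actual outputs y^{(L)}(j) = (0.8, 0.3), formula (H.2) gives

    E(j) = (1/2) ( (1 − 0.8)² + (0 − 0.3)² ) = (1/2)(0.04 + 0.09) = 0.065.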

First of all, let us define a formula for the value of the i-th weight of the k-th neuron of the r-th layer at the (j + 1)-th step of learning. Below, v^{(r)(k)}(j) = Σ_i W_i^{(r)(k)}(j) X_i^{(r)(k)}(j) denotes the weighted sum of the inputs of the neuron N^{(r)(k)} at the j-th step (cf. formula (11.1)). From formulas (H.1) and (H.2) we obtain

    W_i^{(r)(k)}(j + 1) = W_i^{(r)(k)}(j) − α · ∂E(j)/∂W_i^{(r)(k)}(j)
                        = W_i^{(r)(k)}(j) − α · ∂E(j)/∂v^{(r)(k)}(j) · ∂v^{(r)(k)}(j)/∂W_i^{(r)(k)}(j)   (H.3)
                        = W_i^{(r)(k)}(j) − α · ∂E(j)/∂v^{(r)(k)}(j) · X_i^{(r)(k)}(j).

Now, let us introduce the following denotation in formula (H.3):

    δ^{(r)(k)}(j) = − ∂E(j)/∂v^{(r)(k)}(j).   (H.4)

Then, we obtain the following formula:

    W_i^{(r)(k)}(j + 1) = W_i^{(r)(k)}(j) + α δ^{(r)(k)}(j) X_i^{(r)(k)}(j).   (H.5)
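For instance, with α = 0.5, δ^{(r)(k)}(j) = 0.2, X_i^{(r)(k)}(j) = 1.0, and W_i^{(r)(k)}(j) = 0.3, formula (H.5) gives

    W_i^{(r)(k)}(j + 1) = 0.3 + 0.5 · 0.2 · 1.0 = 0.4.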

Formula (H.5) is analogous to formula (11.16) in Sect. 11.2, which includes a description of the backpropagation method. At the end of our considerations, we should derive a formula for δ^{(r)(k)}(j). Let us determine it, firstly, for neurons of the input layer and the hidden layers.

    δ^{(r)(k)}(j) = − ∂E(j)/∂v^{(r)(k)}(j) = − Σ_m ∂E(j)/∂v^{(r+1)(m)}(j) · ∂v^{(r+1)(m)}(j)/∂v^{(r)(k)}(j),   (H.6)

where the sum runs over all the neurons of the (r + 1)-th layer, since E depends on v^{(r)(k)}(j) only through the weighted sums of that layer.

By applying formula (H.4) and making use of formula (11.1) introduced in Sect. 11.2, we receive

    δ^{(r)(k)}(j) = Σ_m δ^{(r+1)(m)}(j) · ∂( Σ_i W_i^{(r+1)(m)}(j) X_i^{(r+1)(m)}(j) ) / ∂v^{(r)(k)}(j).

From formula (11.12) introduced in Sect. 11.2 we find that X_i^{(r+1)(m)}(j) = y^{(r)(i)}(j), i.e., the inputs of the (r + 1)-th layer are the outputs of the r-th layer. Hence

    δ^{(r)(k)}(j) = Σ_m δ^{(r+1)(m)}(j) · ∂( Σ_i W_i^{(r+1)(m)}(j) y^{(r)(i)}(j) ) / ∂v^{(r)(k)}(j).

Since only the output y^{(r)(k)}(j) of the k-th neuron depends on v^{(r)(k)}(j), we finally obtain

    δ^{(r)(k)}(j) = ( Σ_m δ^{(r+1)(m)}(j) W_k^{(r+1)(m)}(j) ) · ∂y^{(r)(k)}(j)/∂v^{(r)(k)}(j),

i.e., the error signal δ of a neuron of the r-th layer is computed by propagating backwards the error signals of the (r + 1)-th layer. For a neuron of the output (L-th) layer, differentiating (H.2) directly according to denotation (H.4) gives

    δ^{(L)(m)}(j) = ( u^{(m)}(j) − y^{(L)(m)}(j) ) · ∂y^{(L)(m)}(j)/∂v^{(L)(m)}(j).
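The derivation above can be condensed into a short numerical sketch. The following Python code is a minimal illustration of formulas (H.4)–(H.6), not the book's implementation: the two-layer architecture, the sigmoid activation function, and the learning coefficient alpha are illustrative assumptions.

    import numpy as np

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def sigmoid_prime(v):
        s = sigmoid(v)
        return s * (1.0 - s)          # dy/dv for the sigmoid activation

    # Illustrative two-layer network: r = 1 (hidden), r = 2 = L (output).
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3))      # weights W_i^{(1)(k)}: 4 hidden neurons, 3 inputs
    W2 = rng.normal(size=(2, 4))      # weights W_i^{(2)(m)}: 2 output neurons
    alpha = 0.5                       # learning coefficient (assumed value)

    x = np.array([0.1, 0.7, -0.3])    # input vector
    u = np.array([1.0, 0.0])          # required output signals u^{(m)}

    for j in range(1000):
        # Forward pass: weighted sums v and outputs y of each layer (11.1).
        v1 = W1 @ x;  y1 = sigmoid(v1)     # hidden layer
        v2 = W2 @ y1; y2 = sigmoid(v2)     # output layer (r = L)

        # delta of the output layer: (u - y) * dy/dv, i.e. (H.4) applied to (H.2).
        delta2 = (u - y2) * sigmoid_prime(v2)
        # delta of the hidden layer, propagated backwards, cf. (H.6).
        delta1 = (W2.T @ delta2) * sigmoid_prime(v1)

        # Weight updates, formula (H.5): W(j+1) = W(j) + alpha * delta * X.
        W2 += alpha * np.outer(delta2, y1)
        W1 += alpha * np.outer(delta1, x)

    print(y2)                         # approaches the required output u

Note that the matrix form computes all the error signals δ^{(r)(k)}(j) of a layer at once; this is exactly the layer-by-layer backward pass described in Sect. 11.2.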
