Lecture ANN 4
Perceptron Learning Rule
Learning Rules

• Supervised Learning
  The network is provided with a set of examples of proper network behavior (input/target pairs):
  $\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}$

• Reinforcement Learning
  The network is only provided with a grade, or score, which indicates network performance.

• Unsupervised Learning
  Only the network inputs are available to the learning algorithm. The network learns to categorize (cluster) the inputs.
Perceptron Architecture

[Figure: single-layer perceptron. Input vector p (R×1) feeds a hard-limit layer with weight matrix W (S×R) and bias b (S×1); net input n = Wp + b (S×1); output a (S×1).]

$$a = \mathrm{hardlim}(Wp + b)$$

Weight matrix and its rows:

$$W = \begin{bmatrix} w_{1,1} & w_{1,2} & \cdots & w_{1,R} \\ w_{2,1} & w_{2,2} & \cdots & w_{2,R} \\ \vdots & \vdots & & \vdots \\ w_{S,1} & w_{S,2} & \cdots & w_{S,R} \end{bmatrix}, \qquad
{}_i w = \begin{bmatrix} w_{i,1} \\ w_{i,2} \\ \vdots \\ w_{i,R} \end{bmatrix}, \qquad
W = \begin{bmatrix} {}_1 w^T \\ {}_2 w^T \\ \vdots \\ {}_S w^T \end{bmatrix}$$

Output of the ith neuron:

$$a_i = \mathrm{hardlim}(n_i) = \mathrm{hardlim}({}_i w^T p + b_i)$$
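To make the layer equation concrete, here is a minimal NumPy sketch of $a = \mathrm{hardlim}(Wp + b)$; the weight, bias, and input values are arbitrary placeholders, not from the lecture:

```python
import numpy as np

def hardlim(n):
    """Hard limit transfer function: 1 if n >= 0, else 0 (elementwise)."""
    return (n >= 0).astype(float)

def perceptron_layer(W, b, p):
    """Output of a single perceptron layer: a = hardlim(Wp + b)."""
    return hardlim(W @ p + b)

# Example with S = 2 neurons and R = 3 inputs (illustrative values only)
W = np.array([[0.5, -1.0,  0.2],
              [1.0,  0.3, -0.7]])   # S x R weight matrix
b = np.array([0.1, -0.4])           # S x 1 bias vector
p = np.array([1.0, -1.0, 1.0])      # R x 1 input vector

print(perceptron_layer(W, b, p))    # S x 1 vector of 0s and 1s
```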
Single-Neuron Perceptron

[Figure: two-input neuron with weights w_{1,1} = 1, w_{1,2} = 1, bias b = −1, and hard-limit output; the p_1–p_2 plane is split by the decision boundary into a region where a = 1 and a region where a = 0.]

$$a = \mathrm{hardlim}({}_1 w^T p + b) = \mathrm{hardlim}(w_{1,1} p_1 + w_{1,2} p_2 + b)$$
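As a concrete check of this equation, here is a minimal Python sketch of the two-input neuron with the weights and bias shown above; the two test inputs are arbitrary, chosen to fall on opposite sides of the boundary:

```python
import numpy as np

def hardlim(n):
    return 1.0 if n >= 0 else 0.0

w = np.array([1.0, 1.0])   # w_{1,1} = 1, w_{1,2} = 1 (from the slide)
b = -1.0                   # bias from the slide

def single_neuron(p):
    return hardlim(w @ p + b)

print(single_neuron(np.array([1.0, 1.0])))   # 1 + 1 - 1 =  1  -> a = 1
print(single_neuron(np.array([0.0, 0.0])))   # 0 + 0 - 1 = -1  -> a = 0
```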
Decision Boundary

$${}_1 w^T p + b = 0 \qquad\Longleftrightarrow\qquad {}_1 w^T p = -b$$

• All points on the decision boundary have the same inner product with the weight vector.
• Therefore they have the same projection onto the weight vector, so they must lie on a line orthogonal to the weight vector.

[Figure: decision boundaries drawn for several weight vectors; in each case the weight vector is orthogonal to its boundary ${}_1 w^T p + b = 0$.]
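A small numeric check of the claim above, reusing the neuron from the previous slide (w = [1, 1], b = −1): every point on its boundary has the same inner product with the weight vector, namely −b.

```python
import numpy as np

w = np.array([1.0, 1.0])
b = -1.0

# The boundary p1 + p2 - 1 = 0 can be parameterized as p2 = 1 - p1.
for p1 in [-2.0, 0.0, 0.5, 3.0]:       # arbitrary sample points on the boundary
    p = np.array([p1, 1.0 - p1])
    print(p, w @ p)                      # the inner product is always 1 = -b
```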
Example - OR

$$p_1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix},\; t_1 = 0 \qquad
p_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\; t_2 = 1 \qquad
p_3 = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\; t_3 = 1 \qquad
p_4 = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\; t_4 = 1$$
OR Solution

Choose a weight vector orthogonal to the decision boundary:

$${}_1 w = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$

Pick a point on the decision boundary to find the bias:

$${}_1 w^T p + b = \begin{bmatrix} 0.5 & 0.5 \end{bmatrix} \begin{bmatrix} 0 \\ 0.5 \end{bmatrix} + b = 0.25 + b = 0 \quad\Rightarrow\quad b = -0.25$$

[Figure: OR input vectors in the p_1–p_2 plane with the chosen decision boundary and the weight vector orthogonal to it.]
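A quick sketch verifying that the weights and bias just derived reproduce the OR training set:

```python
import numpy as np

hardlim = lambda n: 1.0 if n >= 0 else 0.0

w = np.array([0.5, 0.5])   # weight vector from the slide
b = -0.25                  # bias from the slide

# OR training set from the previous slide
P = [np.array([0, 0]), np.array([0, 1]), np.array([1, 0]), np.array([1, 1])]
T = [0, 1, 1, 1]

for p, t in zip(P, T):
    a = hardlim(w @ p + b)
    print(p, "target:", t, "output:", a)   # every output matches its target
```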
Multiple-Neuron Perceptron

Each neuron i has its own decision boundary:

$${}_i w^T p + b_i = 0$$

A single neuron can classify input vectors into two categories. An S-neuron perceptron can classify input vectors into up to $2^S$ categories.
Learning Rule Test Problem

$$\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}$$

$$p_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix},\; t_1 = 1 \qquad
p_2 = \begin{bmatrix} -1 \\ 2 \end{bmatrix},\; t_2 = 0 \qquad
p_3 = \begin{bmatrix} 0 \\ -1 \end{bmatrix},\; t_3 = 0$$

[Figure: two-input neuron with no bias (inputs p_1, p_2, weights w_{1,1}, w_{1,2}, hard-limit output).]

$$a = \mathrm{hardlim}(Wp)$$
Starting Point

Random initial weight:

$${}_1 w = \begin{bmatrix} 1.0 \\ -0.8 \end{bmatrix}$$

Present p₁ to the network:

$$a = \mathrm{hardlim}({}_1 w^T p_1) = \mathrm{hardlim}\left(\begin{bmatrix} 1.0 & -0.8 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) = \mathrm{hardlim}(-0.6) = 0$$

Incorrect classification (the target is t₁ = 1).

[Figure: the three test-problem vectors and the initial weight vector with its decision boundary.]
Tentative Learning Rule

• Set ₁w to p₁
  – Not stable
• Add p₁ to ₁w

Tentative Rule: If t = 1 and a = 0, then ${}_1 w^{new} = {}_1 w^{old} + p$

$${}_1 w^{new} = {}_1 w^{old} + p_1 = \begin{bmatrix} 1.0 \\ -0.8 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2.0 \\ 1.2 \end{bmatrix}$$

[Figure: the updated weight vector now points more toward p₁, and the decision boundary rotates accordingly.]
Second Input Vector

$$a = \mathrm{hardlim}({}_1 w^T p_2) = \mathrm{hardlim}\left(\begin{bmatrix} 2.0 & 1.2 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \end{bmatrix}\right) = \mathrm{hardlim}(0.4) = 1$$

Incorrect classification (the target is t₂ = 0).

Modification to rule: If t = 0 and a = 1, then ${}_1 w^{new} = {}_1 w^{old} - p$

$${}_1 w^{new} = {}_1 w^{old} - p_2 = \begin{bmatrix} 2.0 \\ 1.2 \end{bmatrix} - \begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3.0 \\ -0.8 \end{bmatrix}$$
Third Input Vector

$$a = \mathrm{hardlim}({}_1 w^T p_3) = \mathrm{hardlim}\left(\begin{bmatrix} 3.0 & -0.8 \end{bmatrix} \begin{bmatrix} 0 \\ -1 \end{bmatrix}\right) = \mathrm{hardlim}(0.8) = 1$$

Incorrect classification (the target is t₃ = 0).

$${}_1 w^{new} = {}_1 w^{old} - p_3 = \begin{bmatrix} 3.0 \\ -0.8 \end{bmatrix} - \begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 3.0 \\ 0.2 \end{bmatrix}$$

All three patterns are now correctly classified. When the output is already correct, leave the weights alone:

If t = a, then ${}_1 w^{new} = {}_1 w^{old}$.
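The following sketch replays the three updates above on the no-bias test problem (all values taken from the slides); it simply applies the three-case tentative rule once to each pattern:

```python
import numpy as np

hardlim = lambda n: 1.0 if n >= 0 else 0.0

# Test problem from the slides
P = [np.array([1.0, 2.0]), np.array([-1.0, 2.0]), np.array([0.0, -1.0])]
T = [1.0, 0.0, 0.0]

w = np.array([1.0, -0.8])   # random initial weight from the slide

for p, t in zip(P, T):
    a = hardlim(w @ p)
    if t == 1 and a == 0:
        w = w + p            # move the weight vector toward p
    elif t == 0 and a == 1:
        w = w - p            # move the weight vector away from p
    # if t == a: no change
    print("after", p, "->", w)
# Final weights: [3.0, 0.2], which classify all three patterns correctly
```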
Unified Learning Rule

If t = 1 and a = 0, then ${}_1 w^{new} = {}_1 w^{old} + p$
If t = 0 and a = 1, then ${}_1 w^{new} = {}_1 w^{old} - p$
If t = a, then ${}_1 w^{new} = {}_1 w^{old}$

Define the error $e = t - a$. The three cases become:

If e = 1, then ${}_1 w^{new} = {}_1 w^{old} + p$
If e = −1, then ${}_1 w^{new} = {}_1 w^{old} - p$
If e = 0, then ${}_1 w^{new} = {}_1 w^{old}$

Unified rule:

$${}_1 w^{new} = {}_1 w^{old} + e\,p = {}_1 w^{old} + (t - a)\,p$$
$$b^{new} = b^{old} + e$$

A bias is simply a weight with a constant input of 1, so it is updated the same way.
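A minimal sketch of the unified rule for a single neuron with bias; the helper name unified_update is made up for illustration, and the usage pass over the OR set is only a demonstration:

```python
import numpy as np

hardlim = lambda n: 1.0 if n >= 0 else 0.0

def unified_update(w, b, p, t):
    """One perceptron learning step: w <- w + (t - a) p, b <- b + (t - a)."""
    a = hardlim(w @ p + b)
    e = t - a
    return w + e * p, b + e

# Example: one pass over the OR training set starting from zero weights
w, b = np.zeros(2), 0.0
P = [np.array([0., 0.]), np.array([0., 1.]), np.array([1., 0.]), np.array([1., 1.])]
T = [0.0, 1.0, 1.0, 1.0]
for p, t in zip(P, T):
    w, b = unified_update(w, b, p, t)
print(w, b)   # a single pass is usually not enough; repeat until e = 0 for every pattern
```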
Multiple-Neuron Perceptrons

To update the ith row of the weight matrix:

$${}_i w^{new} = {}_i w^{old} + e_i\, p \qquad b_i^{new} = b_i^{old} + e_i$$

Matrix form:

$$W^{new} = W^{old} + e\,p^T \qquad b^{new} = b^{old} + e$$
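In matrix form the whole update is one outer product. A minimal NumPy sketch (the function name perceptron_step is illustrative, not from the lecture):

```python
import numpy as np

def hardlim(n):
    return (n >= 0).astype(float)

def perceptron_step(W, b, p, t):
    """Matrix-form perceptron update: W <- W + e p^T, b <- b + e, with e = t - a."""
    a = hardlim(W @ p + b)
    e = t - a
    return W + np.outer(e, p), b + e
```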
Apple/Banana Example

Training set:

$$\left\{ p_1 = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix},\; t_1 = 1 \right\} \qquad
\left\{ p_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix},\; t_2 = 0 \right\}$$

Initial weights:

$$W = \begin{bmatrix} 0.5 & -1 & -0.5 \end{bmatrix}, \qquad b = 0.5$$

First iteration:

$$a = \mathrm{hardlim}(W p_1 + b) = \mathrm{hardlim}\left(\begin{bmatrix} 0.5 & -1 & -0.5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} + 0.5\right) = \mathrm{hardlim}(-0.5) = 0$$

$$e = t_1 - a = 1 - 0 = 1$$

$$W^{new} = W^{old} + e\,p^T = \begin{bmatrix} 0.5 & -1 & -0.5 \end{bmatrix} + (1)\begin{bmatrix} -1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} -0.5 & 0 & -1.5 \end{bmatrix}$$

$$b^{new} = b^{old} + e = 0.5 + (1) = 1.5$$
Second Iteration

$$a = \mathrm{hardlim}(W p_2 + b) = \mathrm{hardlim}\left(\begin{bmatrix} -0.5 & 0 & -1.5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} + 1.5\right) = \mathrm{hardlim}(2.5) = 1$$

$$e = t_2 - a = 0 - 1 = -1$$

$$W^{new} = W^{old} + e\,p^T = \begin{bmatrix} -0.5 & 0 & -1.5 \end{bmatrix} + (-1)\begin{bmatrix} 1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} -1.5 & -1 & -0.5 \end{bmatrix}$$

$$b^{new} = b^{old} + e = 1.5 + (-1) = 0.5$$
Check

$$a = \mathrm{hardlim}(W p_1 + b) = \mathrm{hardlim}\left(\begin{bmatrix} -1.5 & -1 & -0.5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} + 0.5\right) = \mathrm{hardlim}(1.5) = 1 = t_1$$

$$a = \mathrm{hardlim}(W p_2 + b) = \mathrm{hardlim}\left(\begin{bmatrix} -1.5 & -1 & -0.5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} + 0.5\right) = \mathrm{hardlim}(-1.5) = 0 = t_2$$

Both patterns are correctly classified.
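The two iterations and the check above can be reproduced with the matrix-form rule; a short sketch using the values from the slides:

```python
import numpy as np

hardlim = lambda n: (np.asarray(n) >= 0).astype(float)

# Training set and initial parameters from the slides
P = [np.array([-1.0, 1.0, -1.0]), np.array([1.0, 1.0, -1.0])]
T = [np.array([1.0]), np.array([0.0])]
W = np.array([[0.5, -1.0, -0.5]])
b = np.array([0.5])

for p, t in zip(P, T):            # first and second iterations
    a = hardlim(W @ p + b)
    e = t - a
    W = W + np.outer(e, p)
    b = b + e
    print(W, b)
# After two iterations: W = [[-1.5, -1.0, -0.5]], b = [0.5]

# Check: both patterns are now classified correctly
for p, t in zip(P, T):
    print(hardlim(W @ p + b), t)
```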
Perceptron Rule Capability

The perceptron learning rule will always converge to weights that accomplish the desired classification, provided such weights exist.
Perceptron Limitations

The perceptron can only form a linear decision boundary:

$${}_1 w^T p + b = 0$$

It therefore cannot solve linearly inseparable problems, such as XOR.