Timber Defect Detection Based On Systematic Feature Analysis And One Class Classifier.

TIMBER DEFECT DETECTION BASED ON SYSTEMATIC FEATURE
ANALYSIS AND ONE CLASS CLASSIFIER

UMMI RABA’AH BINTI HASHIM

A thesis submitted in fulfilment of the
requirements for the award of the degree of
Doctor of Philosophy (Computer Science)

Faculty of Computing
Universiti Teknologi Malaysia

DECEMBER 2015

DEDICATION

To my beloved husband, children, parents and brothers.


ACKNOWLEDGEMENT

In the name of Allah, most gracious, most merciful. Praise to Allah, for
guiding me on the right path and blessing me with the best in this life. It takes the effort
and support of many to bring this research study to completion. I am indebted to the
dozens of people who guided and supported me throughout this study. I would like to
express my gratitude to the following special individuals:
1. My supervisor and co-supervisor, Assoc. Prof. Dr. Siti Zaiton binti Mohd
Hashim and Assoc. Prof. Dr. Azah Kamilah Muda, for their wonderful
guidance and continuous encouragement during the progression of my study.
2. Academicians of UTM, for their valuable teaching, comments, ideas and
motivation for this research.
3. Industry experts from Hasro Malaysia, Teras Puncak and Elegant Success
(Malaysian wood products manufacturers) for their co-operation, invaluable
consultation and kind support.
4. Universiti Teknikal Malaysia Melaka (UTeM) and Ministry of Education
Malaysia for their generous financial support.
5. My husband and children, for their patience and love.
6. My parents and brothers, for their blessing and care.

ABSTRACT

Substantial research effort has been devoted to automating timber defect detection in
order to improve the quality of timber products, optimise raw material resources,
increase productivity and reduce errors related to human labour. This study extends
the work on automated inspection of timber boards to Malaysian timber species, in
the hope that the outcome will benefit the local wood product industries. The study
aims to propose a timber surface defect detection approach that is robust in
detecting various defects on multiple timber species using significant texture
features, validated using data from local timber species. In the experiments, defective
samples from Malaysian hardwood were collected and labelled under the supervision of
industry experts. Additionally, this work gives new insight into the characterisation
of timber defect images by using statistical texture features from the orientation
independent Grey Level Dependence Matrix (GLDM) with appropriate parameter analysis. A
Systematic Feature Analysis (SFA), which includes exploratory and confirmatory
multivariate analysis, was performed to investigate the discriminative power of the
proposed feature set. The SFA produced a feature set of timber surface defects
capable of providing significant discrimination between defect and clear wood
classes. Finally, a new concept in the domain of timber defect detection, based on
outlier detection, was introduced to overcome the problem of imbalanced
data. This study proposes a robust Mahalanobis one class classifier (MC) with the Fast
Minimum Covariance Determinant estimator (MC-FMCD) for species independent
timber defect detection. The experimental results show that the proposed approach
achieved superior performance over the classical Mahalanobis Distance (MD) and is
robust in detecting many types of defects across timber species.
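
The outlier detection idea summarised above can be illustrated with a minimal sketch. It is not the implementation used in this thesis: it assumes only two synthetic feature values per sample, replaces the random restarts of the Fast MCD algorithm with a single deterministic starting subset, applies no consistency correction to the robust covariance, and uses a chi-square cutoff that is an assumption of this example. All function and variable names are hypothetical.

```python
import math
import statistics

def mean_cov(points):
    """Mean and 2x2 sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def sq_mahalanobis(p, mean, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def robust_mean_cov(points, h, max_steps=20):
    """Simplified MCD-style estimate (assumption: one deterministic start).

    Start from the h points nearest the coordinate-wise median, then repeat
    concentration steps: re-estimate mean and covariance from the current
    subset and keep the h points with the smallest Mahalanobis distance.
    """
    med = (statistics.median(x for x, _ in points),
           statistics.median(y for _, y in points))
    subset = sorted(points,
                    key=lambda p: (p[0] - med[0]) ** 2 + (p[1] - med[1]) ** 2)[:h]
    for _ in range(max_steps):
        mean, cov = mean_cov(subset)
        new_subset = sorted(points,
                            key=lambda p: sq_mahalanobis(p, mean, cov))[:h]
        if set(new_subset) == set(subset):
            break
        subset = new_subset
    return mean_cov(subset)

# Assumed cutoff: 97.5% quantile of the chi-square distribution with 2 degrees
# of freedom, which has the closed form -2 * ln(1 - 0.975).
CUTOFF = -2.0 * math.log(1.0 - 0.975)

def is_defect(p, mean, cov):
    """Flag a sample as an outlier (defect) when its distance exceeds the cutoff."""
    return sq_mahalanobis(p, mean, cov) > CUTOFF

# Synthetic training set: 80 "clear wood" samples on a grid around the origin,
# contaminated by 20 unlabelled "defect" samples clustered near (8, 8).
clear = [(-1.75 + 0.5 * i, -2.25 + 0.5 * j) for i in range(8) for j in range(10)]
defects = [(8.0 + 0.1 * (k % 5), 8.0 + 0.1 * (k // 5)) for k in range(20)]
train = clear + defects

classic_mean, classic_cov = mean_cov(train)
robust_mean, robust_cov = robust_mean_cov(train, h=75)

probe = (8.2, 8.2)  # a new defect-like sample
print("classic MD flags probe:", is_defect(probe, classic_mean, classic_cov))
print("robust MD flags probe: ", is_defect(probe, robust_mean, robust_cov))
```

In this synthetic setting the contaminating samples inflate the classical covariance enough to mask the defect cluster, while the estimate built from the concentrated subset is not affected by them. A library implementation of the Fast MCD estimator, such as `MinCovDet` in scikit-learn, would normally take the place of `robust_mean_cov`.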

ABSTRAK

Various research efforts have been carried out in automated timber defect detection
to improve the quality of timber products, optimise raw material resources and
increase productivity. Research in this field has been extended to Malaysian timber
species in the hope that the outcome will benefit the local wood product industries.
This study aims to propose a timber surface defect detection approach that is robust in
detecting various defects on multiple timber species using significant texture
features, validated using data from local timber species. Defect samples from
Malaysian hardwood species were collected and labelled under the supervision of
industry experts for use in this study. In addition, this study gives new understanding
of the feature representation of timber defect images by using statistical texture from
the orientation independent Grey Level Dependence Matrix (GLDM) together with
appropriate parameter analysis. A Systematic Feature Analysis (SFA) comprising
exploratory and confirmatory multivariate analysis was carried out to examine the
discriminative power of the proposed feature set. The SFA produced a feature
representation capable of discriminating significantly between timber defect classes
and clear wood. Finally, a new concept in the domain of timber defect detection, based
on anomaly detection, was introduced to address the problem of imbalanced data. This
study proposes a robust Mahalanobis one class classifier (MC) with the Fast Minimum
Covariance Determinant estimator (MC-FMCD) for timber defect detection regardless of
timber species. The experimental results show that the proposed approach achieved
better performance than the classical Mahalanobis Distance (MD) and is able to detect
many types of defects across multiple timber species.

TABLE OF CONTENTS

CHAPTER   TITLE                                                       PAGE

          DECLARATION                                                 ii
          DEDICATION                                                  iii
          ACKNOWLEDGEMENT                                             iv
          ABSTRACT                                                    v
          ABSTRAK                                                     vi
          TABLE OF CONTENTS                                           vii
          LIST OF TABLES                                              xii
          LIST OF FIGURES                                             xiv
          LIST OF ABBREVIATIONS                                       xvii
          LIST OF APPENDICES                                          xx
          TERMS AND DEFINITIONS                                       xxi

1         INTRODUCTION                                                1
          1.1 Overview                                                1
          1.2 Research Background                                     2
          1.3 Problem Statement and Research Aim                      13
          1.4 Research Objective                                      14
          1.5 Research Scope                                          14
          1.6 Significance of the Study                               16
          1.7 Research Methodology                                    17
          1.8 Research Contribution                                   19
          1.9 Thesis Structure                                        19

2         LITERATURE REVIEW                                           21
          2.1 Introduction                                            21
          2.2 Overview of Timber Process                              26
          2.3 Malaysian Timber Species                                28
          2.4 Timber Defects                                          31
          2.5 Automated Vision Inspection (AVI) of Timber             33
          2.5.1 Problem Background                                    33
          2.5.2 AVI in Wood Industry                                  34
          2.5.3 Sensors Used for AVI in Wood Industry                 39
          2.5.4 General Timber Defect Detection Approach              43
          2.5.5 Feature Extraction on Defect Images                   46
          2.5.6 Defect Classification                                 50
          2.5.7 Discussion                                            53
          2.6 Statistical Texture Feature Based on Grey Level
              Dependence Matrix (GLDM)                                55
          2.6.1 Problem Background                                    55
          2.6.2 Orientation Independent GLDM                          58
          2.6.3 Statistical Features of GLDM                          63
          2.7 One Class Classification for Imbalanced Data            71
          2.7.1 Introduction and Problem Background                   71
          2.7.2 Distance-based One Class Classifier (OCC)             73
          2.7.3 Fast Minimum Covariance Determinant as Robust
                Estimator                                             77
          2.8 Summary                                                 81

3         RESEARCH METHODOLOGY                                        82
          3.1 Introduction                                            82
          3.2 Problem Situation and Solution Concept                  82
          3.3 Research Design                                         87
          3.3.1 Research Framework                                    87
          3.3.2 Operational Framework                                 88
          3.3.2.1 Phase 1: Construction of timber defect
                  image dataset of Malaysian hardwood                 89
          3.3.2.2 Phase 2: Identification of significant texture
                  feature set representing timber defect              90
          3.3.2.3 Phase 3: Development of robust OCC with
                  FMCD estimator for timber defect detection          91
          3.3.3 Overall Research Plan                                 92
          3.4 Evaluation Measurement                                  95
          3.4.1 Multivariate Analysis of Variance (Manova) to
                Evaluate Feature Quality                              95
          3.4.2 Precision, Recall and F Measure to Measure
                Detection Performance                                 100
          3.4.3 Over Detection and Under Detection Errors to
                Assess Segmentation Quality                           102
          3.5 Summary                                                 103

4         CONSTRUCTION OF TIMBER SURFACE DEFECT
          IMAGE DATASET                                               104
          4.1 Introduction                                            104
          4.1 Timber Samples Collection                               106
          4.2 Image Acquisition Setup                                 106
          4.3 Image Labelling and Processing                          110
          4.4 Findings                                                113
          4.5 Summary                                                 116

5         SIGNIFICANT FEATURE SET OF TIMBER SURFACE
          DEFECTS BASED ON STATISTICAL TEXTURE AND
          SYSTEMATIC FEATURE ANALYSIS                                 117
          5.1 Introduction                                            117
          5.2 Overview of Approach                                    118
          5.3 Feature Extraction                                      121
          5.3.1 Extracting Statistical Features from GLDM             121
          5.3.2 Exploring Displacement and Quantization Parameter
                of GLDM                                               127
          5.4 Evaluation of Feature Quality                           133
          5.4.1 Exploratory Feature Analysis                          133
          5.4.1.1 Univariate Feature Range Analysis                   134
          5.4.1.2 Bivariate Matrix of Scatter Plot                    136
          5.4.1.3 Multivariate Intra-Class and Inter-Class
                  Distance between Clear Wood and Defects             137
          5.4.2 Confirmatory Feature Analysis                         139
          5.4.2.1 Removing Linearly Dependent Features                141
          5.4.2.2 Measuring Significant Difference between
                  Defect Classes using Manova Statistics              143
          5.4.2.3 Identifying Significant Features using
                  Posthoc Manova (Discriminant Analysis)              145
          5.5 Performance Validation                                  149
          5.5.1 Measuring Classification Performance across
                Feature Sets and Classifiers                          150
          5.5.2 Measuring Classification Performance of Individual
                Classes                                               153
          5.5.3 Measuring Classification Accuracy across Timber
                Species                                               156
          5.6 Discussion                                              158
          5.7 Summary                                                 159

6         ROBUST MAHALANOBIAN CLASSIFIER WITH FMCD
          ESTIMATOR (MC-FMCD) FOR TIMBER DEFECT
          DETECTION                                                   160
          6.1 Introduction                                            160
          6.2 Overview of Approach                                    161
          6.3 Experimental Setting for Simulated Datasets             163
          6.4 Experimental Results for Simulated Datasets             165
          6.4.1 Detection Performance across Various Defect Ratios    166
          6.4.2 Detection Performance by Defect Type                  170
          6.4.3 Detection Performance between Classic MD and
                Robust MC-FMCD                                        174
          6.4.4 Summary of Detection Performance across Timber
                Species                                               178
          6.5 Expert Validation on Test Images                        180
          6.6 Discussion                                              185
          6.7 Summary                                                 186

7         CONCLUSION AND FUTURE RESEARCH                              188
          7.1 Summary of Research Finding                             188
          7.2 Research Contribution                                   191
          7.3 Future Work Recommendation                              193
          7.4 Concluding Remark                                       195

          REFERENCES                                                  196
          Appendices A - N                                            213 - 297

LIST OF TABLES

TABLE NO.   TITLE                                                           PAGE

2.1         List of Malaysian timber classification based on density
            (MTIB, 2000)                                                    29
2.2         Natural durability classification based on years (MTIB, 2000)   29
2.3         Characteristics of four types of timber species (MTIB, 2000)    30
2.4         List of common timber defect                                    32
2.5         Related works on automated inspection of wood products          36
2.6         Related studies on inspection of external wood defects          40
2.7         Images of directional matrices and rotation invariant matrix    61
3.1         Problem leading to solution                                     86
3.2         Overall research plan                                           92
3.3         Confusion matrix                                                102
4.1         List of data collection setting of past studies on timber
            surface defect detection                                        109
4.2         List of classes with example of sub-images collected            114
4.3         Number of samples collection across species                     116
5.1         Example of sub-image and the corresponding dependence matrix    123
5.2         List of statistical texture features extracted                  124
5.3         Example of extracted features (one sample per class,
            species=Meranti, d=1, q=32)                                     125
5.4         Texture characteristics of clear wood and defect                126
5.5         Distances between test samples and independent clear wood
            samples                                                         142
5.6         List of feature correlation with r>0.99                         142
5.7         List of features removed after correlation test                 143
5.8         Box's test of equality of covariance matrices                   144
5.9         Manova test                                                     144
5.10        Pillai’s Trace value across multiple quantization levels and
            displacements                                                   145
5.11        Eigenvalues and canonical correlations                          146
5.12        Raw and standardized discriminant function coefficients
            (Root 1)                                                        147
5.13        Correlation between features and canonical variable             148
5.14        List of remaining features after discriminant analysis          148
5.15        List of feature sets used for performance comparison            150
5.16        Confusion matrices for D7, D5 and D4                            154
5.17        Samples mistakenly classified as clear wood (undetected
            defect)                                                         155
5.18        Confusion matrices for Merbau, KSK and Rubberwood               157
6.1         Experimental Meranti dataset for various defect ratios          163
6.2         Detection performance by defect ratio                           167
6.3         Detection performance by defect types                           170
6.4         Detection performance on test images: Rubberwood                181
6.5         Detection performance on test images: KSK                       182
6.6         Detection performance on test images: Meranti                   183
6.7         Detection performance on test images: Merbau                    184

LIST OF FIGURES

FIGURE NO.  TITLE                                                           PAGE

1.1         Motivation of the study                                         12
1.2         Overview of research phases                                     18
2.1         Taxonomy of literature review                                   23
2.2         Timber process                                                  26
2.3         Log cutting pattern (Cavette, 2006; Tom & Jeff, 2010)           27
2.4         The components of an AVI system in wood industry                35
2.5         Reference pixel, X with its 8 neighbouring pixels
            (Haralick et al., 1973)                                         59
2.6         Distribution of non-zero matrix element on the left, and
            contour plot showing joint probability density function of
            the spatial dependence matrix on the right                      62
2.7         Research solutions to the problem of classification of
            imbalanced data (Sun et al., 2009)                              73
3.1         Solution concept for timber defect detection                    85
3.2         Research framework                                              88
3.3         Operational research framework                                  89
4.1         Image acquisition setup                                         108
4.2         The process of dataset construction                             111
4.3         Sample of acquired images                                       111
4.4         Subdivision of original image into sub-images                   113
4.5         Distribution of defect samples across species                   115
5.1         Proposed approach in determining significant feature set        120
5.2         Procedures for extracting statistical texture features based on
            GLDM                                                            122
5.3         Pictorial representation of the orientation independent GLDM    128
5.4         Normalized feature means against displacement and
            quantization                                                    131
5.5         Energy feature range analysis                                   134
5.6         Entropy feature range analysis                                  135
5.7         Contrast feature range analysis                                 135
5.8         Scatter plot matrix showing pairwise comparison of features     136
5.9         Intra-class distance between clear wood samples and
            inter-class distance between clear wood and defect samples      138
5.10        Procedures for confirmatory feature analysis                    140
5.11        Classification accuracy of three proposed feature sets (D6,
            D7 and D8)                                                      151
5.12        Classification accuracy between the proposed feature set (D7)
            and feature sets from previous studies                          152
5.13        F scores for each class across datasets D4, D5 and D7           154
5.14        Classification accuracy across timber species                   156
6.1         Flow of experiments for timber defect detection                 161
6.2         Proposed MC-FMCD for robust timber defect detection             162
6.3         F score across defect ratio: (a) Meranti, (b) Rubberwood, (c)
            KSK, (d) Merbau                                                 168
6.4         OD Error and UD Error across defect ratio: (a) Meranti, (b)
            Rubberwood, (c) KSK, (d) Merbau                                 169
6.5         F score by defect type: (a) Meranti, (b) Rubberwood, (c)
            KSK, (d) Merbau                                                 172
6.6         OD Error and UD Error by defect type: (a) Meranti, (b)
            Rubberwood, (c) KSK, (d) Merbau                                 173
6.7         Detection performance for MC-FMCD and classic MD:
            Meranti dataset                                                 174
6.8         Detection performance for MC-FMCD and classic MD:
            Rubberwood dataset                                              175
6.9         Detection performance for MC-FMCD and classic MD:
            KSK dataset                                                     176
6.10        Detection performance for MC-FMCD and classic MD:
            Merbau dataset                                                  177
6.11        Average detection performance by timber species                 178
6.12        Average detection performance by defect type across timber
            species: (a) F score comparison between timber species by
            defect type, (b) Average F score by defect type                 179
6.13        Average detection performance between MC-FMCD and
            classic MD                                                      180
6.14        Average detection performance validated by an expert            185

LIST OF ABBREVIATIONS

ANN       -  Artificial Neural Network
AUTOC     -  Autocorrelation
AVI       -  Automated Vision Inspection
BR        -  Brown Stain
BS        -  Blue Stain
CAR       -  Causal Auto Regressive Model
CCD       -  Charge-Coupled Device
CL        -  Clear Wood
CONT      -  Contrast
COR       -  Correlation
CPROM     -  Cluster Prominence
CSHAD     -  Cluster Shade
CT        -  Computed Tomography
DENT      -  Difference Entropy
DISS      -  Dissimilarity
DVAR      -  Difference Variance
EN        -  Energy
ENT       -  Entropy
EPQ       -  Equal Probability Quantization
FMCD      -  Fast Minimum Covariance Determinant
FMMIS     -  Fuzzy Min-Max Neural Network for Image Segmentation
FN        -  False Negative
FP        -  False Positive
GA        -  Genetic Algorithm
GLDM      -  Grey Level Dependence Matrix
GPR       -  Ground Penetrating Radar
HL        -  Hole
HOMO      -  Homogeneity
IDMN      -  Inverse Difference Moment Normalized
IDN       -  Inverse Difference Normalized
IMC1      -  Information Measures of Correlation 1
IMC2      -  Information Measures of Correlation 2
KN        -  Knot
KNN       -  K-nearest Neighbour
KSK       -  Kembang Semangkuk
LBP       -  Local Binary Pattern
MANOVA    -  Multivariate Analysis of Variance
MAXPR     -  Maximum Probability
MCD       -  Minimum Covariance Determinant
MC-FMCD   -  Mahalanobian Classifier based on Robust FMCD
MD        -  Mahalanobis Distance
MGR       -  Malaysian Grading Rule
MIDA      -  Malaysian Investment Development Authority
MLP       -  Multi-layer Perceptron
MSE       -  Mean Square Error
MTIB      -  Malaysian Timber Industry Board
MVE       -  Minimum Volume Ellipsoid
MVV       -  Minimum Vector Variance
NATIP     -  National Timber Industry Policy
OCC       -  One Class Classifier
OD        -  Over Detection
PC        -  Pocket
RBFN      -  Radial Basis Function Network
RGB       -  Red Green Blue
RT        -  Rot
SAVG      -  Sum Average
SDM       -  Spatial Dependence Matrix
SENT      -  Sum Entropy
SOM       -  Self-organizing Map
SOSVH     -  Sum of Squares: Variance
SP        -  Split
SSCP      -  Sum of Squares Cross Product
SVAR      -  Sum Variance
TN        -  True Negative
TP        -  True Positive
UD        -  Under Detection
WN        -  Wane

LIST OF APPENDICES

APPENDIX    TITLE                                                           PAGE

A           Related studies on inspection of internal wood defects
            Related studies on multi sensors approach to timber
            defect detection                                                213
B           Example of orientation independent GLDM and normalized GLDM     216
C           Plots of feature value against displacement and quantization
            parameter                                                       219
D           Univariate feature range analysis                               236
E           Matrix of scatter plots comparing feature distribution between
            classes                                                         247
F           Pairwise correlation between features and their corresponding
            significance, p value                                           249
G           SPSS Manova output                                              252
H           Experimental dataset for various defect ratios                  260
I           Expert validation sheet                                         267
J           UTM letter of permission for data collection                    280
K           Biography of industry experts                                   284
L           Letter of dataset certification                                 287
M           Photo album                                                     291
N           List of Publication                                             297

TERMS AND DEFINITIONS

TERM                      DEFINITION

Wood                      A hard fibrous material that makes up most of the
                          substance of a tree.

Log                       A part of the trunk that has been cut off from a
                          felled tree.

Timber                    Wood boards sawn from logs.

Primary wood industry     Businesses that process logs or other tree sections
                          directly into timber, veneer, plywood, wood chips or
                          other primary wood products.

Sawmill                   A factory where logs are sawn into timber.

Secondary wood industry   Businesses that process primary wood products such as
                          timber into secondary wood products such as furniture,
                          doors and parquet flooring.

Rough mill                The first production area or stage in a secondary wood
                          product industry, where timber is moulded and cut into
                          rough sized components or parts. At this stage,
                          undesirable characteristics or defects are removed.

Defect                    Flaws or anomalies found on timber that affect its
                          properties and limit its possible use.

Natural defect            Biological defects that occur during the growth of the
                          tree from which the timber originates.

Mechanical defect         Defects that are caused by the handling or processing
                          of timber, such as during drying, sawing and moulding.

Internal defect           Defects that are found inside the timber structure.

External defect           Defects that are found on the surface of timber.