Implementation Of Diamond Search (DS) Algorithm For Motion Estimation Using MATLAB Software.

IMPLEMENTATION OF DIAMOND SEARCH (DS) ALGORITHM FOR MOTION ESTIMATION USING MATLAB SOFTWARE

SIOW CHAN HOEL

This report is submitted in partial fulfillment of the requirements for the award of Bachelor of Electronic Engineering (Telecommunication Electronics) With Honours

Faculty of Electronic and Computer Engineering University Technical Malaysia Malacca


UNIVERSITI TEKNIKAL MALAYSIA MELAKA

FACULTY OF ELECTRONIC AND COMPUTER ENGINEERING

REPORT STATUS VERIFICATION FORM

BACHELOR'S DEGREE PROJECT II

Project Title : Implementation of Diamond Search (DS) Algorithm for Motion Estimation using MATLAB software

Session of Study : 2009/2010

I, SIOW CHAN HOEL, hereby agree that this Bachelor's Degree Project Report be kept in the Library under the following conditions of use:

1. The report is the property of Universiti Teknikal Malaysia Melaka.

2. The Library is permitted to make copies for study purposes only.

3. The Library is permitted to make copies of this report as exchange material between institutions of higher learning.

4. Please tick ( ) :

CONFIDENTIAL* (Contains information of security classification or of importance to Malaysia as stipulated in the OFFICIAL SECRETS ACT 1972)

RESTRICTED* (Contains restricted information as determined by the organisation/body where the research was carried out)

UNRESTRICTED

Verified by:

……… (AUTHOR'S SIGNATURE)

……… (SUPERVISOR'S STAMP AND SIGNATURE)

Permanent Address: No. 8, Jalan Udang Gantung 8, Taman Megah Kepong, 52100 K.L.

Date: April 2010 Date: April 2010


“I hereby declare that this report is the result of my own work except for quotes as cited in the references.”

Signature :

Author :


“I hereby declare that I have read this report and in my opinion this report is sufficient in terms of the scope and quality for the award of Bachelor of Electronic Engineering (Computer Engineering) With Honours.”

Signature :

Supervisor’s Name :


ACKNOWLEDGEMENT

I would like to express my gratitude and thanks to my supervisor, Mr. Redzuan Bin Abdul Manap, for his support and patience throughout the duration of the project. His encouragement and guidance are truly appreciated; without them, this project would not have been possible. I have learnt a lot under his guidance. In addition, I would like to thank my friends Kong Khee Kien and Wong Cheong Lun, who always helped me when I faced problems in this project. I am also grateful to all my friends who helped me and gave me their opinions during the implementation of this project.


ABSTRACT

In order to achieve a high compression ratio in video coding, a technique known as Block Matching Motion Estimation has been widely adopted in various coding standards. This technique is conventionally implemented by exhaustively testing all the candidate blocks within the search window. This type of implementation, called the Full Search (FS) algorithm, gives the optimum solution. However, it requires a substantial computational workload. To overcome this drawback, many fast Block Matching Algorithms (BMAs) have been proposed and developed. These algorithms exploit different search patterns and strategies in order to find the optimum motion vector with a minimal number of search points. The objectives of this project are to develop and implement the Diamond Search (DS) algorithm in MATLAB and to compare the results obtained with those of the FS algorithm and other common fast BMAs. Finally, a functional MATLAB program code is produced.


ABSTRAK

This project aims to achieve a high compression ratio in video coding. A technique known as Block Matching Motion Estimation has conventionally used an algorithm known as Full Search (FS) to test every block within the search window. Although this algorithm gives optimum video quality, it creates a heavy processing load and consequently slows the processing down. To overcome this processing problem, many algorithms have been studied and developed. Various patterns and strategies have been exploited in these algorithms to find the optimum motion vector with a minimum number of search points. The main aim of this project is to develop and implement the algorithm named Diamond Search (DS) in MATLAB. The performance of the DS algorithm is analysed and compared with the FS algorithm and with other algorithms. Finally, a functional program code is produced.


TABLE OF CONTENTS

CHAPTER TITLE

ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATION
LIST OF APPENDIX

I INTRODUCTION
1.1 PROJECT INTRODUCTION
1.2 PROBLEM STATEMENTS
1.3 OBJECTIVES
1.4 SCOPES OF WORK
1.5 THESIS STRUCTURE

II LITERATURE REVIEW
2.1 VIDEO CODING
2.2 MOTION ESTIMATION
2.3 BLOCK MATCHING ALGORITHM (BMA)
2.3.2 FOUR STEP SEARCH ALGORITHM
2.3.3 NEW THREE STEP SEARCH ALGORITHM
2.4 VIDEO SEQUENCE FORMAT

III METHODOLOGY
3.1 METHODOLOGY
3.1.1 PROJECT PLANNING
3.1.2 LITERATURE REVIEW
3.1.3 VIDEO UPLOADING USING MATLAB
3.1.4 FRAME EXTRACTION
3.1.5 BLOCK CONSTRUCTION
3.1.6 IMPLEMENTATION OF DIAMOND SEARCH ALGORITHM
3.1.7 RECONSTRUCTION OF PREDICTED FRAME
3.1.8 PERFORMANCE ANALYSIS
3.1.8.1 PERFORMANCE ANALYSIS PARAMETER
3.1.9 PRESENTATION OF RESULT

IV RESULT AND DISCUSSION
4.1 INTRODUCTION
4.2 FIRST STAGE RESULT
4.3 SECOND STAGE RESULT
4.4 THIRD STAGE RESULTS
4.5 PREDICTED FRAME

V CONCLUSION AND RECOMMENDATION
5.1 CONCLUSION
5.2 RECOMMENDATIONS

REFERENCE


LIST OF TABLES

NO TITLE

3.1 Video sequence being used
4.1 Average points for single frame simulation
4.2 Average PSNR for single frame simulation
4.3 Elapsed time for single frame simulation
4.4 Average points for 30 frames simulation
4.5 Average PSNR for 30 frames simulation
4.6 Elapsed time for 30 frames simulation
4.7 Average points for 100 frames simulation
4.8 Average PSNR for 100 frames simulation


LIST OF FIGURES

NO TITLE

2.1 Block based motion estimation
2.2 An appropriate search pattern support: circular area with radius of 2 pixels
2.3 LDSP
2.4 SDSP
2.5 Three possible cases
2.6 Search path example
2.7 Search patterns of the FSS
2.8 Two large motion search paths of four step search algorithm
2.9 Two small search paths of four step search algorithm
2.10 Example of search pattern of NTSS
3.1 Flow chart of Diamond Search algorithm
4.1 Average Point for Akiyo Video (30 frames)
4.2 Average PSNR for Akiyo Video (30 frames)
4.3 Average Point for Salesman Video (30 frames)
4.4 Average PSNR for Salesman Video (30 frames)
4.5 Average Point for Foreman Video (30 frames)
4.6 Average PSNR for Foreman Video (30 frames)
4.7 Average Point for Coastguard Video (30 frames)
4.8 Average PSNR for Coastguard Video (30 frames)
4.9 Average Point for News Video (30 frames)
4.10 Average PSNR for News Video (30 frames)
4.11 Average Point for Tennis Video (30 frames)
4.13 Average Point for Akiyo Video (100 frames)
4.14 Average PSNR for Akiyo Video (100 frames)
4.15 Average Point for Salesman Video (100 frames)
4.16 Average PSNR for Salesman Video (100 frames)
4.17 Average PSNR for Foreman Video (100 frames)
4.18 Average PSNR for Foreman Video (100 frames)
4.19 Average Point for Coastguard Video (100 frames)
4.20 Average PSNR for Coastguard Video (100 frames)
4.21 Average Point for News Video (100 frames)
4.22 Average PSNR for News Video (100 frames)
4.23 Average Point for Tennis Video (100 frames)
4.24 Average PSNR for Tennis Video (100 frames)
4.25 Original image
4.26 FS predicted frame
4.27 TSS predicted frame
4.28 FSS predicted frame
4.29 NTSS predicted frame


LIST OF ABBREVIATION

BDM - Block Distortion Measure
BMA - Block Matching Algorithm
CIF - Common Intermediate Format
DS - Diamond Search
FS - Full Search
FSS - Four Step Search
LDSP - Large Diamond Search Pattern
MAD - Mean Absolute Difference
MATLAB - Matrix Laboratory
MBD - Minimum Block Distortion
ME - Motion Estimation
MPEG - Moving Picture Experts Group
MSE - Mean Squared Error
NTSS - New Three Step Search
PSNR - Peak Signal-To-Noise Ratio
QCIF - Quarter Common Intermediate Format


LIST OF APPENDIX

NO TITLE PAGE


CHAPTER I

INTRODUCTION

1.1 PROJECT INTRODUCTION

A technique known as Block Matching Motion Estimation has been widely adopted in various coding standards to achieve a high compression ratio in video coding. This technique is conventionally implemented by exhaustively testing all the candidate blocks within the search window. This type of implementation, called the Full Search (FS) algorithm, gives the optimum solution. However, it requires a substantial computational workload. To overcome this drawback, many fast Block Matching Algorithms (BMAs) have been proposed and developed. These algorithms exploit different search patterns and strategies in order to find the optimum motion vector with a minimal number of search points.

One of these fast BMAs, which is implemented in this project, is the Diamond Search (DS) algorithm. The algorithm is implemented in MATLAB and its performance is then compared with that of the FS algorithm and of other fast BMAs in terms of peak signal-to-noise ratio (PSNR), number of required search points and computational complexity.
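As an illustration of the PSNR measure named above, the following is a minimal sketch in Python (the MATLAB code of this project is not reproduced here). It assumes 8-bit grayscale frames stored as nested lists and uses the conventional definition of PSNR from the mean squared error with a peak value of 255. The function names `mse` and `psnr` are hypothetical.

```python
import math


def mse(original, predicted):
    """Mean squared error between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_o, row_p in zip(original, predicted):
        for a, b in zip(row_o, row_p):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, predicted, peak=255):
    """PSNR in dB; peak=255 assumes 8-bit pixels (an assumption of this sketch)."""
    err = mse(original, predicted)
    if err == 0:
        return math.inf  # identical frames have no distortion
    return 10 * math.log10(peak ** 2 / err)
```

A higher PSNR means that the predicted frame is closer to the original frame.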


1.2 PROBLEM STATEMENTS

In recent years, several video compression standards, such as CCITT H.261, MPEG-1 and MPEG-2, have been proposed for different applications. Video data generally constitutes most multimedia data, so efficient video coding is important for the effective use of limited bandwidth and storage. Temporal correlation between successive image frames enables a high degree of compression, and motion estimation is an important tool for exploiting it. Block-based motion estimation with non-overlapping rectangular blocks is used in many video coding standards. Image frames are divided into non-overlapping blocks and, for each block, the best match is searched for within a pre-defined search range using all possible positions.

Although this FS method provides optimal quality, it suffers from a heavy computational load. The FS method matches every possible displaced candidate block within the search area of the reference frame in order to find the block with minimum distortion. It therefore evaluates a large number of search points for each block, and the computation may be too costly.
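The size of this workload can be estimated with a short calculation, sketched here in Python. The search range p = 7 pixels, the 16 x 16 block size and the helper names are assumptions made for illustration; 176 x 144 is the standard QCIF frame size. The figure is an upper bound, because blocks at the frame border have fewer valid candidates.

```python
def full_search_points(p):
    """Candidate positions per block for a search range of +/- p pixels."""
    return (2 * p + 1) ** 2


def blocks_per_frame(width, height, block=16):
    """Number of whole non-overlapping blocks that fit in a frame."""
    return (width // block) * (height // block)


points = full_search_points(7)        # 15 x 15 candidate positions
blocks = blocks_per_frame(176, 144)   # QCIF frame, 16 x 16 blocks
print(points, blocks, points * blocks)
```

Under these assumptions FS checks up to 225 positions for each of the 99 blocks, that is, up to 22275 block comparisons per frame.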

1.3 OBJECTIVES

The main objective of this project is to implement one of the fast BMAs, namely the DS algorithm, to overcome the problem encountered by the FS algorithm. The specific aims are:

a) To develop and implement the DS algorithm in MATLAB.

b) To compare and analyze the performance of the DS algorithm against the FS algorithm as well as other common fast BMAs.


1.4 SCOPES OF WORK

The scope of work in this project is:

a) Data and theory acquisition on image processing, motion estimation, BMAs and the Diamond Search algorithm.

b) Implementation of the DS algorithm in MATLAB.

c) Performance comparison of the algorithm with other available BMAs.

1.5 THESIS STRUCTURE

Chapter 1 Introduction

A general description of the project idea, a clarification of the project scope, and a review of the problem statement that motivates the project and leads to its objectives.

Chapter 2 Literature Review

This chapter covers the study of the conventional video coding algorithm and of the video coding algorithm used in this project. The algorithms described are the Full Search, Diamond Search, New Three Step Search and Four Step Search algorithms.

Chapter 3 Methodology

This chapter presents the project planning. The project is divided into nine steps, each of which is described.


Chapter 4 Result and Discussion

This chapter presents the results obtained and discusses them. The results are analyzed and then compared with the results from other algorithms.

Chapter 5 Conclusion and Suggestion

This chapter gives overall comments on the project and suggestions for improving it.


CHAPTER II

LITERATURE REVIEW

2.1 VIDEO CODING

Video compression is the reduction of the amount of data used to represent visual images. In video transmission, the key requirement is to transmit the video quickly while keeping its quality good.

Video is a sequence of images played at a given rate. Between two consecutive frames, many pixels may remain unchanged; they are therefore redundant and can be eliminated for faster data transmission. By identifying the pixel differences between the two frames, the transmitter can send only the differences and the receiver can reconstruct the video from them.
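The idea of sending only the differences can be sketched as follows. This is a minimal Python illustration under the assumption that frames are grayscale images stored as nested lists; the function names are hypothetical and no entropy coding is shown.

```python
def frame_difference(previous, current):
    """Pixel-wise difference that the transmitter would send."""
    return [[c - p for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(previous, current)]


def reconstruct(previous, difference):
    """Receiver side: rebuild the current frame from the previous one."""
    return [[p + d for p, d in zip(row_p, row_d)]
            for row_p, row_d in zip(previous, difference)]


previous = [[10, 10, 10], [10, 10, 10]]
current = [[10, 10, 12], [10, 10, 10]]

diff = frame_difference(previous, current)
changed = sum(1 for row in diff for d in row if d != 0)
print(diff, changed)
```

Only one of the six pixels changes in this example, so the difference image is almost entirely zero and is cheaper to encode than the full frame.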

Nowadays, most video is digital. File size is an important concern because digital video files tend to take up a lot of storage space on the hard drive. Compressing the video makes it easier to store.

Digital video can be compressed without affecting the perceived quality of the final product because compression affects only the parts of the video that humans may not really detect.


Compressed video effectively reduces the bandwidth required, and its applications therefore include video transmission via terrestrial broadcast, cable TV or satellite TV services [1].

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These pixel groups or blocks of pixels are compared from one frame to the next and the video compression codec (encode/decode scheme) sends only the differences within those blocks.
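A minimal Python sketch of dividing a frame into macroblocks is given below. It assumes a grayscale frame stored as a nested list and simply ignores any partial blocks at the right and bottom edges; the function name and the dictionary layout are choices made for this illustration.

```python
def split_into_blocks(frame, block=16):
    """Map each block's (top, left) corner to its block x block pixel values."""
    height, width = len(frame), len(frame[0])
    blocks = {}
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            blocks[(top, left)] = [row[left:left + block]
                                   for row in frame[top:top + block]]
    return blocks


# A 4 x 4 test frame split into 2 x 2 blocks to keep the example small.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = split_into_blocks(frame, block=2)
print(sorted(blocks))
```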

2.2 MOTION ESTIMATION

Motion estimation is the process of determining motion vectors that describe the transformation from one two-dimensional image to another, usually between adjacent frames in a video sequence. The idea of video compression based on motion estimation is to save bits by sending encoded difference images, which inherently have less energy and can be compressed more highly than a full frame.

The motion in the current frame is estimated with respect to a previous frame. Motion information is used in video compression to find the best-matching block in the reference frame and so obtain a low-energy residue.

The aim is to obtain motion vectors, which may relate to the whole image or to specific parts such as rectangular blocks, arbitrarily shaped patches or even individual pixels. This technique reduces the temporal redundancy that results from the high correlation between consecutive frames.

Motion estimation analyzes previous and next frames to identify blocks that have not changed or have moved. The motion estimation module creates a model of the current frame by modifying the reference frames so that the model closely matches the current frame. This estimated current frame is then motion compensated, and the compensated residual image is encoded and transmitted. Its applications include scan rate conversion to generate temporally interpolated frames. It is also used in applications such as motion-compensated de-interlacing, video stabilization and motion tracking.
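The modelling step described above can be sketched as follows: a predicted frame is built by copying, for each block, the reference-frame block that its motion vector points to, and the residual is the difference between the current and predicted frames. This Python sketch assumes that the motion vectors are already known, that each vector (dy, dx) points from the block position to the matched block in the reference frame, and that all vectors stay inside the frame. The function names are hypothetical.

```python
def motion_compensate(reference, vectors, block=16):
    """Build a predicted frame from {(top, left): (dy, dx)} motion vectors."""
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for (top, left), (dy, dx) in vectors.items():
        for r in range(block):
            for c in range(block):
                predicted[top + r][left + c] = reference[top + dy + r][left + dx + c]
    return predicted


def residual(current, predicted):
    """Difference image that would be encoded and transmitted."""
    return [[a - b for a, b in zip(row_c, row_p)]
            for row_c, row_p in zip(current, predicted)]


reference = [[r * 4 + c for c in range(4)] for r in range(4)]
vectors = {(0, 0): (0, 0), (0, 2): (0, -1), (2, 0): (-1, 1), (2, 2): (0, 0)}
predicted = motion_compensate(reference, vectors, block=2)
print(predicted)
```

When the prediction matches the current frame exactly, the residual is all zeros, which is the best case for compression.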

2.3 BLOCK MATCHING ALGORITHM (BMA)

Successive video frames may contain the same objects (stationary or moving). Motion estimation examines the movement of objects in an image sequence to try to obtain motion vectors representing the estimated motion. Motion compensation uses the knowledge of object motion so obtained to achieve data compression.

In real video scenes, motion can be a complex combination of translation and rotation. Such motion is difficult to estimate and may require a large amount of processing. However, translational motion is easily estimated and has been used successfully for motion-compensated coding.

The block matching estimation algorithm assumes that objects are rigid and move translationally for at least a few frames, and it neglects both the occlusion of one object by another and uncovered background.

BMA is a block-based search technique. The idea behind it is to divide the current frame into a matrix of macro blocks, each of which is compared with the corresponding block and its adjacent neighbors in the previous frame to create a vector that describes the movement of the macro block from one location to another in the previous frame.

This movement, calculated for all the macro blocks comprising a frame, constitutes the motion estimated in the current frame. A matching window equal in size to the block is moved over the search area in the previous frame to find the displacement of the best-matched block, which is taken as the motion vector of the block in the current frame. Usually the macro block is taken as a square of side 16 pixels.
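The block matching procedure can be illustrated with an exhaustive (full) search for a single block, sketched here in Python. The sketch assumes the mean absolute difference (MAD) as the matching cost, a search range of p pixels in each direction, and a motion vector (dy, dx) defined as the displacement from the block position in the current frame to the best match in the previous frame. Candidates that fall outside the frame are skipped. The function names and parameters are hypothetical.

```python
def mad(current, reference, top, left, dy, dx, block):
    """Mean absolute difference between a block and a displaced candidate."""
    total = 0
    for r in range(block):
        for c in range(block):
            total += abs(current[top + r][left + c]
                         - reference[top + dy + r][left + dx + c])
    return total / (block * block)


def full_search(current, reference, top, left, block=16, p=7):
    """Return (best vector, best cost, number of points checked) for one block."""
    height, width = len(reference), len(reference[0])
    best_vector, best_cost, points = (0, 0), float("inf"), 0
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            r0, c0 = top + dy, left + dx
            if r0 < 0 or c0 < 0 or r0 + block > height or c0 + block > width:
                continue  # candidate lies partly outside the frame
            cost = mad(current, reference, top, left, dy, dx, block)
            points += 1
            if cost < best_cost:
                best_cost, best_vector = cost, (dy, dx)
    return best_vector, best_cost, points


# 8 x 8 reference frame; the 2 x 2 block at (2, 2) of the current frame
# is a copy of the reference block at (3, 1).
reference = [[r * 8 + c for c in range(8)] for r in range(8)]
current = [row[:] for row in reference]
for r in range(2):
    for c in range(2):
        current[2 + r][2 + c] = reference[3 + r][1 + c]

print(full_search(current, reference, top=2, left=2, block=2, p=2))
```

Fast BMAs such as DS aim to find the same or a similar vector while evaluating far fewer candidate positions than this exhaustive loop.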

