Implementation Of Block Matching Algorithm Based On Hexagon Shape Search Pattern For Motion Estimation In Video Coding Application.

(1)

IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON HEXAGON SHAPE SEARCH PATTERN FOR MOTION ESTIMATION IN

VIDEO CODING APPLICATION

MUHAMMAD GHIFFARI REDHA BIN RAMLI


(2)

IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON HEXAGON SHAPE SEARCH PATTERN FOR MOTION ESTIMATION IN

VIDEO CODING APPLICATION

MUHAMMAD GHIFFARI REDHA BIN RAMLI

This Report Is Submitted In Partial Fulfillment Of Requirements For The Bachelor Degree of Electronic Engineering (Telecommunication Electronics)

Faculty Of Electronic And Computer Engineering Universiti Teknikal Malaysia Melaka


(3)

UNIVERSTI TEKNIKAL MALAYSIA MELAKA

FAKULTI KEJURUTERAAN ELEKTRONIK DAN KEJURUTERAAN KOMPUTER BORANG PENGESAHAN STATUS LAPORAN

PROJEK SARJANA MUDA II

Tajuk Projek :

IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON HEXAGON SHAPE SEARCH PATTERN FOR MOTION ESTIMATION IN

VIDEO CODING APPLICATION Sesi

Pengajian : 0 8 / 1 2

Saya MUHAMMAD GHIFFARI REDHA BIN RAMLI

mengaku membenarkan Laporan Projek Sarjana Muda ini disimpan di Perpustakaan dengan syarat-syarat kegunaan seperti berikut:

1. Laporan adalah hakmilik Universiti Teknikal Malaysia Melaka.

2. Perpustakaan dibenarkan membuat salinan untuk tujuan pengajian sahaja.

3. Perpustakaan dibenarkan membuat salinan laporan ini sebagai bahan pertukaran antara institusi pengajian tinggi.

4. Sila tandakan ( √ ) :

SULIT* *(Mengandungi maklumat yang berdarjah keselamatan atau kepentingan Malaysia seperti yang termaktub di dalam AKTA RAHSIA RASMI 1972)

TERHAD** **(Mengandungi maklumat terhad yang telah ditentukan oleh organisasi/badan di mana penyelidikan dijalankan)

TIDAK TERHAD

Disahkan oleh:

__________________________ ___________________________________

(TANDATANGAN PENULIS) (COP DAN TANDATANGAN PENYELIA)

Tarikh :………


(4)

iii

“I hereby declare that this report is the result of my own work except for quotes as cited in the reference”

Signature : ……….

Author : MUHAMMAD GHIFFARI REDHA BIN RAMLI Date : 15 JUNE 2012


(5)

iv

“I hereby declare that I have read this report and in my opinion this report is sufficient in terms of the scope and quality for the award of Bachelor of Electronic

Engineering (Telecommunication Electronics) With Honours”

Signature : ……….

Supervisor Name : MR REDZUAN BIN ABDUL MANAP


(6)

v


(7)

vi

ACKNOWLEDGEMENT

Alhamdulillah, In the name of Allah, the Most Gracious and the Most Merciful, All praises to Allah for the strengths and His blessing in completing this project. This project would not have been possible without the guidance and the help of several individuals who in one way or another way that contributed and extended their valuable assistance for preparation and completion of this project.

First and foremost,Special appreciation goes to my supervisor, Mr. Redzuan Bin Abdul Manap, for his supervision, encouragement, advice and ideas that have been poured out to me throughout this project.

I also wish to express my appreciation to all my lecturers and my friends, especially Mohd Helmi and Mohd Shahlan for sharing the knowledge and similar research interest and also for their help and encouragement.

Last but not least, my deepest gratitude goes to my beloved parents Mr. Ramli Bin Hamdan and Mrs. Siti Amnah Binti Hamzah for their unfailing encouragement and financial support that they have given to me for over the years and also to all my sisters and brothers for their endless love, prayers and encouragement. To those who indirectly contributed in this project, your kindness means a lot to me. Thank you very much.


(8)

vii

ABSTRACT

Video commonly refers to transmission and storage formats for moving pictures. Different types of video produced different image quality and video size. Those different characteristic can be determined based on the compression technique. In compression process, there is one important module called block motion estimation. Block matching algorithm (BMA) is one way in motion estimation (ME) for finding minimum motion vector (MV). BMA consists of Full Search (FS) algorithm and Fast search algorithm. Conventionally FS algorithm has been used for BMA where it exhaustively evaluates all possible blocks over the determined search window to find the best match. However, this method has high computational cost which makes encoding process slower and it is not practical for ‘power-limited’ device. In order to increase speed during encoding, many fast search algorithm has been implemented. In this project hexagon based algorithm which is one of the fast search algorithms is chosen to be implemented using MATLAB software.


(9)

viii

ABSTRAK

Video lazimnya merujuk kepada format penghantaran dan penyimpanan untuk imej bergerak. Jenis video yang berlainan akan menghasilkan kualiti imej yang berbeza serta saiz video yang berbeza. Perbezaan ini dapat dikenal pasti melalui jenis pemampatan yang digunakan. Dalam proses pemampatan, terdapat satu modul penting yang dipanggil anggaran gerakan blok. Algoritma penyamaan blok (BMA) adalah satu kaedah digunakan dalam anggaran gerakan (ME) bagi mencari vector gerakan (MV) yang paling rendah. BMA terbahagi kepada algoritma pencarian penuh (FS) dan algoritma pencarian pantas. Secara konvensionalnya algoritma FS digunakan dalam BMA dimana ianya menilai secara menyeluruh semua blok yang berada didalam tetingkap carian yang ditentukan bagi mencari padanan yang terbaik. Namun begitu kaedah ini mempunyai kos pengiraan yang tinggi seterusnya menjadikan proses pengekodan lebih perlahan dan ia tidak praktikal untuk digunakan dengan peranti 'kuasa-terhad'. Dalam usaha untuk meningkatkan kelajuan semasa pengekodan, banyak algoritma carian pantas telah dibangunkan. Dalam projek ini algoritma heksagon yang merupakan salah satu algoritma carian pantas telah dipilih untuk dilaksanakan dengan menggunakan perisian MATLAB.


(10)

ix

CONTENTS

CHAPTER TITLE PAGE

PROJECT TITLE i

REPORT STATUS VERIFICATION FORM ii

STUDENT’S DECLARATION Iii

SUPERVISOR’S DECLARATION iv

DEDICATION v

ACKNOWLEDGEMENT vi

ABSTRACT……… vii

ABSTRAK………... viii

CONTENT……….. ix

LIST OF TABLES………. xii

LIST OF FIGURES .……….. xiii

LIST OF ABBREVIATION……….. xv

1.0 INTRODUCTION 1.1 Project backgrounds... 1

1.2 Problem statement……… 2

1.3 Objective……….. 2

1.4 Scope of work……….. 3


(11)

x

2.0 LITERATURE REVIEW

2.1 Video Compression……….. 5

2.1.1 The Need of Compression………... 6

2.1.2 Compression Standard……… 7

2.1.3 Video Compression Technique…………... 10

2.2 Motion Estimation……… 12

2.3 Block Matching Algorithm……….. 13

2.4 Full search BMA………. 15

2.5 Diamond search algorithm………... 16

2.6 Cross Diamond Search Algorithm……… 18

2.7 Audio Video Interleave……… 19

2.8 MATLAB Software………. 20

3.0 PROJECT METHODOLOGY 3.1 Literature review……….. 22

3.2 Uploading video sequence……… 23

3.3 Extraction of frame……….. 23

3.4 Block construction……… 23

3.5 Implementation of HEXBS Algorithm into the block.. 24

3.6 Construction of predicted frame……….. 24

3.7 Performance analysis……… 24

4.0 HEXAGON BASED SEARCH ALGORITHM 4.1 Introduction………. 25

4.2 HEXBS pattern………. 26

4.3 HEXBS algorithm development………... 26

5.0 RESULTS AND DISCUSSIONS 5.1 Video sequences.……….. 29


(12)

xi

5.2 PSNR comparison …………..………. 29

5.3 Search point and processing time comparison.……… 36

5.2 Visual comparison………...………. 42

6.0 CONCLUSION……….. 46

REFERENCES……….. 48


(13)

xii

LIST OF TABLE

TABLE TITLE PAGE

5.1 PSNR comparison (dB) 31

5.2 Average number search point comparison. 36


(14)

xiii

LIST OF FIGURE

FIGURE TITLE PAGE

2.1 Video sequence 10

2.2 Comparison between current and reference frame 11

2.3 Prediction error 11

2.4 Frame in search window 14

2.5 Diamond search pattern 17

2.6 LDSP and SDSP pattern 17

2.7 Cross and diamond search pattern 18

4.1 HEXBS pattern 26

4.2 HEXBS development 27

5.1 PSNR (dB) vs frame number for Akiyo. 32

5.2 PSNR (dB) vs frame number for News. 32

5.3 PSNR (dB) vs frame number for Salesman. 33

5.4 PSNR (dB) vs frame number for Coastguard. 34

5.5 PSNR (dB) vs frame number for Tennis. 35

5.6 PSNR (dB) vs frame number for Foreman. 35

5.7 Frame number vs Search point for Akiyo. 38

5.8 Frame number vs Search point for News. 38

5.9 Frame number vs Search point for Salesman. 39 5.10 Frame number vs Search point for Coastguard. 40


(15)

xiv

5.12 Frame number vs Search point for Foreman. 41

5.13 Original frame for “Akiyo” 42

5.14 Predicted frame for “Akiyo” 42

5.15 Original frame for “News” 43

5.16 Predicted frame for “News” 43

5.17 Original frame for “Salesman” 44

5.18 Predicted frame for “Salesman” 44

5.19 Original frame for “Coastguard” 44

5.20 Predicted frame for “Coastguard” 44

5.21 Original frame for “Tennis” 45

5.22 Predicted frame for “Tennis” 45

5.23 Original frame for “ Foreman ” 45


(16)

xv

LIST OF ABBREVIATIONS

ATM - Asynchronous Time Multiplexing AVI - Audio Video Interleave

BMA - Block Matching Algorithm

CCB Cross-Center Biased

CDS - Cross-Diamond Search

CIF - Common Intermediate Format CSP - Cross-shaped Pattern

DCT - Discrete Cosine Transform

DS - Diamond Search

DVD - Digital Versatile Disk

FS - Full Search

HDTV - High-definition Television HEXBS - Hexagon based search algorithm ISDN - Integrated Services Digital Network ISO International Organization Standardization ITU - International Telecommunication Union JPEG - Joint Photographic Experts Group

LAN - Local Area Network

LDSP - Large Diamond Search Pattern MAD - Mean Absolute Difference MBD - Minimum Block Distortion


(17)

xvi

ME - Motion Estimation

MPEG - Moving Picture Experts Group

MSE - Mean Squared Error

MV - Motion Vector

NCCF - Normalized Cross-Correlation Function PSNR - Peak-to-Noise-Ratio

QCIF - Quarter Common Intermediate Format SAD - Sum of Absolute Difference

SDSP - Small Diamond Search Pattern


(18)

1

CHAPTER 1

INTRODUCTION

1.1 Project background

To achieve high compression ratio in video coding, a technique known as Block Matching Motion Estimation has been widely adopted in various coding standards. This technique is implemented conventionally by exhaustively testing all the candidate blocks within the search window. This type of implementation, called Full Search (FS) Algorithm, gives the optimum solution. However, substantial amount of computational workload is required in this algorithm. To overcome this drawback, many fast Block Matching Algorithms (BMA‘s) have been proposed and developed. Different search patterns and strategies are exploited in these algorithms in order to find the optimum Motion Vector (MV) with minimal number of required search points.

One of these fast BMA‘s, which is proposed to be implemented in this project, is called Hexagon Based Search (HEXBS) Algorithm. The algorithm is to be stimulated in MATLAB and its corresponding performance results to be compared to FS algorithm as


(19)

2

well as to other fast BMA‘s in terms of the peak signal-to-noise ratio (PSNR) produced, number of search points required and computational complexity.

1.2 Problem statement

Implementation of FS algorithm in motion estimation (ME) process required substantial amount of computational workload. However this drawback can be overcome by many types of fast BMA‘s which have been proposed and developed recently. Different search patterns and strategies are exploited in these fast BMA algorithms in order to find the optimum MV with minimal number of required search point.

1.3 Objective

Apart from solving FS algorithm weaknesses in ME process, the other objectives of this project are:

i) To develop, implement and analyze HEXBS algorithm by using MATLAB. ii) To compare the performance of HEXBS algorithm to FS algorithm as well as

other common fast BMA’s.


(20)

3

1.4 Scope of work

The project has 3 distinct scopes which are:

I. Background Study

Further reading on video or image compression, ME, BMA and HEXBS Algorithm to gain understanding and sound knowledge on field interest.

II. Algorithm Development

The HEXBS algorithm is implemented and simulated using MATLAB.

III. Perfomance Analysis

The perfomance of HEXBS algorithm is compared with existing fast block matching motion estimation algorithms to see the suitability of the alforithm for ME process in video coding techniques.

1.5 Report structure

The first chapter in this report is introduction. I will have a few sub-topics, which are project background, problem statement, objectives and scope of work. In project background brief discussion on video compression, ME, BMA and HEXBS algorithm is presented. Next, problems of FS algorithm which give the reason for work in this project are explained. Consequently, it leads to the purpose of this. In order to achieve the objectives of this project, scope of work has been determined and described too.


(21)

4

Second chapter in this report is literature review. This chapter is divided into a few of sub chapter that contain fundamental information related to the video compression, ME, BMA and several type of fast BMA available as well as video format and software tool used in this project.

Next chapter is project methodology. This part explains about guidelines and how for this project where it mentioned in detail steps that should be taken in order to complete the project. Project starts with literature review before implementation of HEXBS algorithm in MATLAB and conclude with performance analysis.

The fourth chapter focusing on HEXBS algorithm. This chapter will explain introduction of HEXBS, search pattern and development steps of HEXBS algorithm in much more details.

The fifth chapter is the result and discussion. The simulation results of HEXBS algorithm obtained in this project is presented and analyzed in comparison to FS algorithm and other fast BMA’s.

The final chapter in this report is conclusion. The author concludes the overall project work and put recommendations for future work.


(22)

5

CHAPTER 2

LITERATURE REVIEW

2.1 Video Compression

The volume of information required to represent a high quality Digital Video is usually very large, which becomes a challenge for Internet distribution of high volumes of content. This is basically why the data must be compressed. Uncompressed footage from a camcorder takes about 17MB per second of video. Because it takes so much space, video must be compressed before it is put on the web [1]. Compressions mean that the size of data has been reducing in order to save space or transmission time. Compression can be either lossy compression or lossless compression.


(23)

6

Lossy compression means that the compressed file has less data in it than the original file. In some cases this translates to lower quality files, because information has been lost. However, at the time the difference started to be noticed a relatively large amount of data have been lost. Lossy compression makes up for the loss in quality by producing comparatively small files. For example, digital versatile disk (DVDs) are compressed using the MPEG-2 format, which can make files 15 to 30 times smaller, but people still tend to perceive DVDs as having high-quality picture.

Lossless compression is exactly what it sounds like, compression where none of the information is lost. This is not nearly as useful because files often end up being the same size as they were before compression. This may seem pointless, as reducing the file size is the primary goal of compression. However, if file size is not an issue, using lossless compression will result in a perfect-quality picture. For example, a video editor transferring files from one computer to another using a hard drive might choose to use lossless compression to preserve quality while he or she is working.

2.1.1 The Need of Compression

A single digital television signal in CCIR 601 format requires a transmission rate of 216 Mbps. This bit rate is too high for most existing practical communication networks. For example, most local area networks (LANs) offer data transmission at rates on the order of 10 Mbps, and most wide area networks (WANs) support much lower data rates than this. The emerging Asynchronous Time Multiplexing (ATM) networks are capable of transmitting higher bit rates. However, distributing an uncompressed CCIR 601 bit stream over these networks is still prohibitively expensive [2].


(24)

7

This shows that the digital video information should be compressed or encoded prior to transmission to accommodate for different transmission media‘s capabilities. At the receiver‘s end, the compressed bit stream received is first decompressed or decoded and then displayed. A number of video coding techniques and standards have been developed within the last few years that exploit the inherent redundancy in still images and moving video sequences to provide significant data compression.

2.1.2 Compression Standard

The Moving Picture Experts Group (MPEG) is a working group of experts that was formed by ISO and IEC to set standards for audio and video compression and transmission. It was established in 1988 by the initiative of Hiroshi Yasuda (Nippon Telegraph and Telephone) and Leonardo Chiariglione. The MPEG standards consist of different Parts. They include [3]:

i. MPEG-1

Designed for up to 1.5 Mbit/sec Standard for the compression of moving pictures and audio. This was based on CD-ROM video applications, and is a popular standard for video on the Internet, transmitted as .mpg files. In addition, level 3 of MPEG-1 is the most popular standard for digital compression of audio known as MP3. MPEG-1 is the standard of compression for VideoCD, the most popular video distribution format thoughout much of Asia.

ii. MPEG-2

Designed for between 1.5 and 15 Mbit/sec Standard on which Digital Television set top boxes and DVD compression is based. It is based on MPEG-1, but designed for the compression and transmission of digital broadcast television.


(1)

number of search points required and computational complexity.

1.2 Problem statement

Implementation of FS algorithm in motion estimation (ME) process required substantial amount of computational workload. However this drawback can be overcome by many types of fast BMA‘s which have been proposed and developed recently. Different search patterns and strategies are exploited in these fast BMA algorithms in order to find the optimum MV with minimal number of required search point.

1.3 Objective

Apart from solving FS algorithm weaknesses in ME process, the other objectives of this project are:

i) To develop, implement and analyze HEXBS algorithm by using MATLAB. ii) To compare the performance of HEXBS algorithm to FS algorithm as well as

other common fast BMA’s.


(2)

1.4 Scope of work

The project has 3 distinct scopes which are:

I. Background Study

Further reading on video or image compression, ME, BMA and HEXBS Algorithm to gain understanding and sound knowledge on field interest.

II. Algorithm Development

The HEXBS algorithm is implemented and simulated using MATLAB.

III. Perfomance Analysis

The perfomance of HEXBS algorithm is compared with existing fast block matching motion estimation algorithms to see the suitability of the alforithm for ME process in video coding techniques.

1.5 Report structure

The first chapter in this report is introduction. I will have a few sub-topics, which are project background, problem statement, objectives and scope of work. In project background brief discussion on video compression, ME, BMA and HEXBS algorithm is presented. Next, problems of FS algorithm which give the reason for work in this project are explained. Consequently, it leads to the purpose of this. In order to achieve the objectives of this project, scope of work has been determined and described too.


(3)

few of sub chapter that contain fundamental information related to the video compression, ME, BMA and several type of fast BMA available as well as video format and software tool used in this project.

Next chapter is project methodology. This part explains about guidelines and how for this project where it mentioned in detail steps that should be taken in order to complete the project. Project starts with literature review before implementation of HEXBS algorithm in MATLAB and conclude with performance analysis.

The fourth chapter focusing on HEXBS algorithm. This chapter will explain introduction of HEXBS, search pattern and development steps of HEXBS algorithm in much more details.

The fifth chapter is the result and discussion. The simulation results of HEXBS algorithm obtained in this project is presented and analyzed in comparison to FS algorithm and other fast BMA’s.

The final chapter in this report is conclusion. The author concludes the overall project work and put recommendations for future work.


(4)

CHAPTER 2

LITERATURE REVIEW

2.1 Video Compression

The volume of information required to represent a high quality Digital Video is usually very large, which becomes a challenge for Internet distribution of high volumes of content. This is basically why the data must be compressed. Uncompressed footage from a camcorder takes about 17MB per second of video. Because it takes so much space, video must be compressed before it is put on the web [1]. Compressions mean that the size of data has been reducing in order to save space or transmission time. Compression can be either lossy compression or lossless compression.


(5)

original file. In some cases this translates to lower quality files, because information has been lost. However, at the time the difference started to be noticed a relatively large amount of data have been lost. Lossy compression makes up for the loss in quality by producing comparatively small files. For example, digital versatile disk (DVDs) are compressed using the MPEG-2 format, which can make files 15 to 30 times smaller, but people still tend to perceive DVDs as having high-quality picture.

Lossless compression is exactly what it sounds like, compression where none of the information is lost. This is not nearly as useful because files often end up being the same size as they were before compression. This may seem pointless, as reducing the file size is the primary goal of compression. However, if file size is not an issue, using lossless compression will result in a perfect-quality picture. For example, a video editor transferring files from one computer to another using a hard drive might choose to use lossless compression to preserve quality while he or she is working.

2.1.1 The Need of Compression

A single digital television signal in CCIR 601 format requires a transmission rate of 216 Mbps. This bit rate is too high for most existing practical communication networks. For example, most local area networks (LANs) offer data transmission at rates on the order of 10 Mbps, and most wide area networks (WANs) support much lower data rates than this. The emerging Asynchronous Time Multiplexing (ATM) networks are capable of transmitting higher bit rates. However, distributing an uncompressed CCIR 601 bit stream over these networks is still prohibitively expensive [2].


(6)

This shows that the digital video information should be compressed or encoded prior to transmission to accommodate for different transmission media‘s capabilities. At the receiver‘s end, the compressed bit stream received is first decompressed or decoded and then displayed. A number of video coding techniques and standards have been developed within the last few years that exploit the inherent redundancy in still images and moving video sequences to provide significant data compression.

2.1.2 Compression Standard

The Moving Picture Experts Group (MPEG) is a working group of experts that was formed by ISO and IEC to set standards for audio and video compression and transmission. It was established in 1988 by the initiative of Hiroshi Yasuda (Nippon Telegraph and Telephone) and Leonardo Chiariglione. The MPEG standards consist of different Parts. They include [3]:

i. MPEG-1

Designed for up to 1.5 Mbit/sec Standard for the compression of moving pictures and audio. This was based on CD-ROM video applications, and is a popular standard for video on the Internet, transmitted as .mpg files. In addition, level 3 of MPEG-1 is the most popular standard for digital compression of audio known as MP3. MPEG-1 is the standard of compression for VideoCD, the most popular video distribution format thoughout much of Asia.

ii. MPEG-2

Designed for between 1.5 and 15 Mbit/sec Standard on which Digital Television set top boxes and DVD compression is based. It is based on MPEG-1, but designed for the compression and transmission of digital broadcast television.