Conceptual development of resources discovery in the proposed hybrid P2P video streaming.


PROFESOR DR. NANNA SURYANA
Paper Published in Journal Of Advanced Manufacturing Technology
Special Edition International Conference On Engineering And ICT (ICEI 2007)
Vol. 1, No. 1, Nov.-Dec. 2007, ISSN: 1985-3157

UNIVERSITI TEKNIKAL MALAYSIA MELAKA

© Universiti Teknikal Malaysia Melaka

CONCEPTUAL DEVELOPMENT OF RESOURCES DISCOVERY IN
THE PROPOSED HYBRID P2P VIDEO STREAMING
Tajuddin, A.S. (1) and Nanna, S.H. (2)
(1) Security, Intelligent Application and Multimedia Program,
TM Research and Development, Serdang, Malaysia
(2) Faculty of Information Technology and Communication (FTMK),
Technical University of Malaysia Melaka (UTeM), Melaka, Malaysia
(2) [email protected]

ABSTRACT
We present the design of a hybrid peer-to-peer (P2P) system for video streaming.
In this paper, we address the availability, accessibility and lookup of files. We
exploit the advantages of the client-server model to search for and retrieve
information. We implement a base ontology for the video domain so that a search
can return more, and different, results than a plain keyword search. To provide a
dynamic standby peer, we use the checksum value as an indicator to find identical
content in the P2P network. We hypothesize that using client-server search in a
P2P application reduces lookup latency, path length, peer load and network
traffic.
Keywords: hybrid Peer-to-Peer, lookup service, information retrieval, multimedia
distribution

1.0 INTRODUCTION

File sharing has grown in popularity as peer-to-peer (P2P) applications such as
BitTorrent [5], eDonkey2000 [11], Gnutella [12], FreeNet [14], KaZaA [17] and
Napster [28] continue to emerge and give the community a way to share content.
Nowadays, P2P technology is used not only for downloading but also for streaming,
and many researchers have implemented video streaming in a P2P environment, for
example ZIGZAG [10], SplitStream [25], CollectCast [27], GnuStream [35] and
CELL [36].
In our solution, we design a P2P service for media streaming. A streaming
session is a unicast session that consists of one sender and a single receiver peer,
similar to a video-on-demand application. The only difference is that peers act
both as a server and a client, which eliminates the need for a dedicated video server
to manage and stream the video. To back up the sender, we propose a dynamic
standby peer that replaces a failed active sender. In our previous paper [1], we
presented multiple distributed servers on top of a structured overlay network topology,
composed of one nucleus supervisor and one or more child supervisors.
The supervised peers manage the peers, and the peer topology itself is an unstructured
overlay network. We rank peers from bad to good. We believe that selecting the
best peer to serve a streaming session is vital in providing a good-quality
environment for video streaming. The objective of this paper is to
preserve the quality of the P2P video streaming application by:
i.   Designing the topology of overlay for dynamic servers and peers;
ii.  Improving the availability of files, lookup service and control delegation;
     and
iii. Maintaining a good real-time playback to requesting peer.
This paper is organized as follows. Section 2, Related Work, discusses existing
work, particularly on the availability, accessibility, searching and retrieval of
information. Section 3, Research Design, discusses how file information is pushed
to the supervised peer, how queries are matched, and the ontology for the video
domain. In Section 4, we briefly discuss our simulation findings on path length for
file availability and accessibility in different P2P models: the proposed design,
Gnutella, BitTorrent, Chord and FreeNet. Finally, Section 5 presents the conclusion
of this paper.

2.0 RELATED WORK
Decentralized, or pure, P2P systems such as Pastry [2], Tapestry [4], Viceroy [7],
Chord [15], Kademlia [30], CAN [33], Gnutella, KaZaA, CollectCast, FreeNet
and CELL do not have a server that provides the location of data, whereas centralized
P2P systems such as Napster, BitTorrent, eDonkey2000 and SPON [6],[18] use a
server to locate data. CAN, Chord, Tapestry, Pastry, Kademlia, Viceroy and
CollectCast offer a Distributed Hash Table (DHT), which guarantees that data can
be located within the overlay network topology. The application performs a query
by matching a key and/or NodeID against the DHT routing information.
Table 1 shows the routing path lengths of various P2P systems for lookup
services.


2.1 Structured Pure P2P Application

Chord is a structured P2P application that is completely decentralized and symmetric,
and it can find data in O(log N) messages. Each Chord node needs "routing"
information about only a few other nodes. It uses a DHT in which a Chord node
communicates with other nodes in order to perform a lookup. For correctness, each
node only needs to know how to contact its current successor node on the identifier
circle. Queries for a given identifier can be passed around the circle via these
successor pointers until they encounter a pair of nodes that straddle the desired
identifier; the finger table accelerates this search. The start of each finger table
entry is generated using formula (1). In the steady state, in an N-node system, each
node maintains information about only O(log N) other nodes, and resolves all lookups
via O(log N) messages to other nodes.


$$(n + 2^{k-1}) \bmod 2^{m}, \qquad 1 \le k \le m \tag{1}$$

n = identifier of the node; k = index of the finger table entry; m = number of bits
in the identifiers.
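
Formula (1) can be evaluated directly. The following is a minimal sketch, taking n to be the identifier of the node that owns the finger table (as in the original Chord design) and m the identifier length in bits; the function name `finger_starts` is hypothetical and only computes the start of each finger interval, not the successor lookup.

```python
def finger_starts(n: int, m: int) -> list[int]:
    """Return the start identifiers of the m finger table entries of node n.

    Entry k (1 <= k <= m) starts at (n + 2^(k-1)) mod 2^m, as in formula (1).
    """
    if m <= 0:
        raise ValueError("m must be a positive number of identifier bits")
    if not 0 <= n < 2 ** m:
        raise ValueError("n must be an identifier in the range [0, 2^m)")
    return [(n + 2 ** (k - 1)) % (2 ** m) for k in range(1, m + 1)]


if __name__ == "__main__":
    # Node 6 on a 3-bit identifier circle (identifiers 0..7).
    print(finger_starts(6, 3))
```

Because the offsets double with each entry, m entries are enough to cover the whole identifier circle, which is what keeps the routing state per node at O(log N).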

FreeNet also implements a DHT, but of a loose type. FreeNet uses keys
(keyword-signed keys and signed-subspace keys) and descriptive text strings to
identify data objects. A request is forwarded from peer to peer until it exceeds its
Hops-To-Live limit or the specified number of results is reached. If a file is found
during a lookup, it is returned to the original requester and cached in the
data-store of each node along the return path. If a data-store does not have enough
space for the newer data item, it removes the least recently used item and inserts
the new one. This method increases the availability and accessibility of popular
files by shortening the path length, so lookup time decreases rapidly over time.
Unfortunately, non-popular files have a shorter life in the FreeNet network.
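
The caching behaviour described above can be illustrated with a small sketch. It assumes that the item removed from a full data-store is the least recently used one; the class `DataStore` and the function `cache_along_path` are hypothetical names for illustration and are not part of FreeNet itself.

```python
from collections import OrderedDict


class DataStore:
    """A fixed-capacity store that evicts the least recently used item."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict[str, bytes] = OrderedDict()

    def get(self, key: str):
        """Return the cached data for key, or None if it is not cached."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, data: bytes) -> None:
        """Insert data, evicting the least recently used items if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = data
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def cache_along_path(path: list, key: str, data: bytes) -> None:
    """Cache a found file in every data-store on the return path."""
    for store in path:
        store.put(key, data)
```

Popular files are requested often, so they keep moving to the "recent" end of each store and survive, while rarely requested files drift to the other end and are evicted first.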
In CollectCast, the authors use Tapestry as the lookup mechanism. They
modified Tapestry so that it can return one or more supplier peers. CollectCast
has a delegation control module that selects which peer(s) become the supplier.
It uses the offer rate and available bandwidth in a topology-aware selection
technique to choose the active and standby senders. The problem with this system
is that its list of standby senders is static: there is no solution for the situation
in which all standby peers leave the network.


2.2 Unstructured Pure P2P Application

Gnutella uses a flooding technique, which requires O(N) steps and is bounded by a
Time-To-Live (TTL) limit. Unfortunately, Gnutella cannot guarantee to locate data,
even though its search requests can saturate the network. Gnutella introduced the
Ultrapeer [3] to improve lookup services. A peer becomes an Ultrapeer automatically
if it meets minimal constraints, such as not being behind a firewall, running a
suitable operating system and having sufficient uptime. Once it becomes an
Ultrapeer, it receives and stores the metadata of shared data, and processes
incoming query requests from connected peers and other Ultrapeers.
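
To illustrate why TTL-limited flooding cannot guarantee a hit, the sketch below floods a query over a small overlay graph. It is a simplified model under stated assumptions: the overlay is an adjacency dictionary, duplicate queries are dropped by each peer, and the function name `flood_search` is hypothetical.

```python
from collections import deque


def flood_search(graph: dict, holders: set, start: str, ttl: int):
    """Flood a query from start; return (peers holding the file, messages sent).

    graph maps each peer to the list of its neighbours. A peer forwards the
    query to all of its neighbours as long as the remaining TTL is positive.
    """
    visited = {start}
    queue = deque([(start, ttl)])
    found = []
    messages = 0
    while queue:
        peer, remaining = queue.popleft()
        if peer in holders:
            found.append(peer)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbour in graph[peer]:
            messages += 1  # every forwarded copy costs one message
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return found, messages


if __name__ == "__main__":
    overlay = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(flood_search(overlay, {"D"}, "A", ttl=2))  # D is out of reach
    print(flood_search(overlay, {"D"}, "A", ttl=3))  # D is reached
```

With a TTL of 2 the file at D is never found although it exists, while raising the TTL finds it at the cost of more messages.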
In KaZaA, peers are connected to Super-Peers [23]. The functionality of a
Super-Peer is similar to that of an Ultrapeer. The difference is that a KaZaA user
can choose to become a Super-Peer. When a peer joins the network, it tries to connect
to an existing Super-Peer and starts to push its metadata to that Super-Peer.
CELL is principally used to cache any amount of video data, to reduce the search
scope of a video lookup and to enhance the availability of a video through caching
coordination. The supplying and requesting peers determine whether the requesting
peer should cache some of the segmented video file and become a caching host.
CELL uses a Gnutella-like technique to locate data, and the search stops when one
caching host is found. From there, by using the CacheTable, it can locate
complete sets of video segments.
GnuStream is built on top of Gnutella, and it integrates dynamic peer location and
streaming capacity aggregation. Each GnuStream streaming session is controlled
by the receiver peer and involves a dynamic set of peer senders instead of one fixed
sender. The receiver aggregates streaming bandwidth from the multiple senders,
achieving load distribution and fast reaction to sender capacity and on/off-line status
changes.

2.3 Hybrid P2P Application

In a BitTorrent network, a central server (e.g. Supernova [32]) collects and stores
.torrent files. Peers search in O(1) steps via the server and get a reply that lists,
in one result set, each file with its multiple peers as well as multiple files with
identical content. Once a peer selects and downloads a .torrent file, it also gets the
Uniform Resource Locator (URL) of the selected tracker. When the BitTorrent client
extracts the information in the .torrent file, it starts to download the data in
segments of 256 KB from one or multiple peers and, at the same time, reports its
activity to the tracker. The function of the tracker is to keep track of the
download/upload activity of peers, so that the tracker indirectly knows where to find
a new supplier. The download is complete after all segments have been downloaded
and integrated into one file.
Napster pioneered the use of a centralized server. The server collects all
metadata and provides the location of data to peers. The metadata stored at the
server grows as O(N), where N is the number of peers. A peer searches for a file in
O(1) steps via the server, which may return multiple files with identical content in
one result set. Hence, because the server stores all information about the shared
files, the Napster network can guarantee to locate data. Basically, eDonkey2000
follows the business model of Napster.

Table 1: Routing journey in various P2P systems

Application    Routing step in       Application        Routing step in
               lookup services                          lookup services
-----------    ------------------    ---------------    ---------------
CAN            O(d·N^(1/d))          Gnutella           O(N)
Chord          O(log N)              GnuStream          O(N)
Tapestry       O(log_B N)            CELL               O(N)
Pastry         O(log_(2^b) N)        Napster            O(1)
Kademlia       O(log_B N) + c        BitTorrent         O(1)
Viceroy        O(log N)              eDonkey2000        O(1)
CollectCast    O(log_B N)            Proposed Design    O(n)
SPON           O(1)

Legend:
N - number of peers in network
d - number of dimensions
B - base of the chosen peer identifier
b - number of bits
c - small constant
n - number of child supervisors


2.4 Search Engine

Besides P2P applications, we can look at web search engines [31], such as Google
and Yahoo!. Google has a PageRank system which rates the relevancy of Web pages
to queries based not only on whether they include the keywords but also on how many
other relevant pages link to them, whereas Yahoo! uses human editors to provide
metadata about many Web sites, giving the system a way to judge the relevancy of
potential search results.

3.0 RESEARCH DESIGN

We design an unstructured peer overlay network topology over a structured,
distributed supervised peer system. The distributed supervised peer system consists
of one nucleus supervisor and one or more child supervisors. The function of these
servers is to manage the accountability and availability of peers, including
join/leave requests, registration of shared content, rank prediction, file
delegation and metadata searching, whereas each peer manages, interacts with and
streams the content. We design the system so that child supervisors are dynamic:
they can join or leave at any time without degrading the performance of the system
(although at least one child supervisor must remain in the system). Each child
supervisor has its own database, as does the nucleus supervisor.

3.1 Network Topology

Each peer must register with the nucleus supervisor the first time it connects. Once
registration is done, the nucleus supervisor pushes information (e.g. IP address
and NodeID) about the existing child supervisor(s) to the peer. The peer creates
an identifier table to store all information regarding the child supervisor(s). The
nucleus supervisor also alerts all child supervisors about the new peer. From then
on, the peer can join the P2P network at any time using the identifier table. If the
login process fails (e.g. because of a timeout or an overloaded supervisor), the peer
re-connects to another child supervisor. Once the peer has successfully joined the
system, the Transmission Control Protocol (TCP) connection between the child
supervisor and the peer is closed. The child supervisor assumes that a peer is still
online based on the periodic messages received from it. We chose this mechanism to
reduce bottlenecks, so that other peers can perform join requests concurrently.
A child supervisor can also redirect a join request to another child supervisor
if it is overloaded with peers.
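
The join procedure with fail-over can be sketched as follows. This is an illustration only: the class `ChildSupervisor`, its `capacity` field and the function `join_network` are hypothetical, network I/O is replaced by in-memory calls, and a timeout is modelled simply as a supervisor that is offline.

```python
from dataclasses import dataclass, field


@dataclass
class ChildSupervisor:
    node_id: str
    ip: str
    capacity: int  # assumed limit on peers before the supervisor is overloaded
    online: bool = True
    peers: set = field(default_factory=set)

    def login(self, peer_id: str) -> bool:
        """Accept the peer unless this supervisor is offline or overloaded."""
        if not self.online or len(self.peers) >= self.capacity:
            return False
        self.peers.add(peer_id)
        return True


def join_network(peer_id: str, identifier_table: list):
    """Try each child supervisor in the identifier table until one accepts."""
    for supervisor in identifier_table:
        if supervisor.login(peer_id):
            return supervisor
    return None  # no child supervisor could accept the peer


if __name__ == "__main__":
    table = [
        ChildSupervisor("S1", "10.0.0.1", capacity=1, peers={"P1"}),
        ChildSupervisor("S2", "10.0.0.2", capacity=3),
    ]
    accepted_by = join_network("P2", table)
    print(accepted_by.node_id if accepted_by else "join failed")
```

Here S1 is already full, so the peer falls through to S2, which mirrors the re-connect step described above.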
When a new child supervisor wants to join the system, it communicates with the
nucleus supervisor and asks for the identifier table information.
Once the child supervisor receives the information, it creates a new identifier
table, establishes TCP connections with the other child supervisors, and tells
them, "I'm Your Sibling". Each recipient child supervisor updates its identifier
table and pushes the new child supervisor's information to its online peers. Each
peer then inserts a new record into its identifier table. Offline peers receive the
update after logging into the system.
Figure 1 shows three distributed child supervisors, S1, S2 and S3, which manage
different peers. S1 monitors P1, P2 and P3; S2 monitors P5, P6 and P8;
and S3 monitors P7, P9 and P4. S1, S2 and S3 are connected to each other so that
they are aware of, and replicate to Sn, updated information about peers (e.g.
bandwidth and packet loss rate). If one child supervisor leaves the network, all of
its peers have to be re-assigned to another child supervisor. For example, if S1
leaves the network, then P1, P2 and P3 have to find a new closest child supervisor
(e.g. S2) to make sure that all new interval information can still be received by
the nucleus.

Figure 1. Hybrid Peer-to-Peer Overlay Network Topology

Peers are not connected directly to one another, and their overlay network
topology is unstructured. When a peer is looking for a file, it queries the child
supervisor(s) to find out whether the data is available, and the child supervisor
responds with a list of peers that hold the requested file. After this, if any file
is listed, the peer can download the file directly from the source. We now
discuss the availability, searching, matching and location of files in hybrid P2P
video streaming.

3.2 Availability And Accessibility Of File

A peer must meet two policies before it can share its content. First, the peer may
share its files only while it is online. Second, the peer may share its content only
if its current offer rate is above 128 kbps.
A user first needs to log in to the network, and then pushes the metadata of his/her
shared files to the child supervisor over TCP. In Yahoo!, human editors provide
the metadata about Web sites; in our case, the user himself/herself should
provide correct information about his/her shared content. If anything changes in
the content, the system should upload the amended metadata to the child
supervisor. When a peer leaves the network, the child supervisor should delete all
metadata related to that peer. Hence, only the metadata owned by online peers is
visible to other peers. This also increases the speed of searches in the database,
and it satisfies the user's need to watch a file immediately after it is found; it
should be remembered that this is streaming, not downloading.
What data should the metadata pushed to the child supervisor contain? We propose
to transmit the explicit file name, title, synopsis, checksum and file extension
(such as avi, mpeg or mp3). We use the md5sum [34] technique to hash the video
content and obtain a checksum value of 32 hexadecimal characters.
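
A metadata record of this kind could be assembled as in the sketch below, which uses the MD5 implementation in Python's standard `hashlib` module in place of the md5sum command-line tool. The function names and the dictionary keys are hypothetical; only the list of fields comes from the text above.

```python
import hashlib
import io
import os


def md5_checksum(stream, chunk_size: int = 64 * 1024) -> str:
    """Hash a binary stream in chunks and return 32 hexadecimal characters."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def build_metadata(file_name: str, title: str, synopsis: str, stream) -> dict:
    """Build the metadata record that a peer pushes to its child supervisor."""
    extension = os.path.splitext(file_name)[1].lstrip(".").lower()
    return {
        "file_name": file_name,
        "title": title,
        "synopsis": synopsis,
        "checksum": md5_checksum(stream),
        "extension": extension,
    }


if __name__ == "__main__":
    # An in-memory stand-in for a video file opened in binary mode.
    video = io.BytesIO(b"example video bytes")
    print(build_metadata("Lecture01.AVI", "Lecture 1", "Introduction", video))
```

Two files with identical content produce the same checksum regardless of their names, which is what allows the checksum to identify candidate standby peers holding the same video.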

3.3 Lookup And Matching

This service attempts to satisfy users' queries primarily by looking for occurrences
of the keywords in the metadata. Metadata that includes the keywords is considered
a good match. Lorenzo Thione in [31] said that, "We have to train ourselves, out of
necessity, to translate our needs into keywords as successfully as we can". The
proposed system can guarantee to locate data in O(1) to O(n) steps via the child
supervisors, where n is the number of child supervisors. In theory, the system
should reduce lookup latency, search path length and peer load compared with the
other P2P applications listed in Table 1. We fix the number of records that are
returned to the requesting peer; here, X denotes this fixed number. As shown in
Fig. 2, when a peer sends a query to its child supervisor, the child supervisor
matches the query against its metadata. If fewer than X records are found, or no
record is found, the originator child supervisor asks the other child supervisors,
using its identifier table. When the originator child supervisor has X records or
has received replies from all child supervisors, it pushes the result to the
requesting peer. Each record consists of a title, checksum value, file size, IP
address, port number and rank. All records are sorted by rank, current offer rate
and balance of limitation.
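
The lookup procedure can be summarised in a short sketch. Several details are assumptions made only for illustration: matching is a case-insensitive substring test on the title, a higher rank and a higher offer rate sort first, and the "balance of limitation" criterion is omitted because it is not defined in this section. The names `Record`, `match_local` and `lookup` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    checksum: str
    size: int
    ip: str
    port: int
    rank: int
    offer_rate_kbps: int


def match_local(metadata: list, keyword: str) -> list:
    """Return the records of one child supervisor whose title has the keyword."""
    needle = keyword.lower()
    return [record for record in metadata if needle in record.title.lower()]


def lookup(keyword: str, origin_metadata: list, sibling_metadata: list, x: int):
    """Collect up to x records, asking sibling supervisors only when needed.

    Returns the sorted records and the number of supervisors consulted,
    which ranges from 1 (local hit) to n (every child supervisor asked).
    """
    results = match_local(origin_metadata, keyword)
    consulted = 1
    for metadata in sibling_metadata:
        if len(results) >= x:
            break  # enough records: stop before contacting more siblings
        results.extend(match_local(metadata, keyword))
        consulted += 1
    results.sort(key=lambda r: (r.rank, r.offer_rate_kbps), reverse=True)
    return results[:x], consulted
```

The number of supervisors consulted is what gives the O(1) to O(n) bound: one step when the originator already holds X matching records, and n steps when every child supervisor has to be asked.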
There are many ways to retrieve information. We propose to use exact and truncated
keywords, the file extension, the checksum value and a video/audio base
ontology repository.

3.3.1 Keyword Search

The user can choose to search with an exact word or with truncation. An exact search
retrieves information that contains the exact keywords anywhere in the metadata,
such as in the file name, title or synopsis. Truncation uses wildcard characters
(e.g. % or *) to retrieve several words that begin with the same stem. This method
is effective for words that have different spellings or different suffixes. In other
words, truncation retrieves more results, or broadens the search.
Exact word

SELECT *
FROM metadata_table
WHERE file_name = 'keyword' OR file_title = 'keyword' OR
      file_synopsis = 'keyword'

Truncation

SELECT *
FROM metadata_table
WHERE file_name LIKE '%keyword%' OR file_title LIKE
      '%keyword%' OR file_synopsis LIKE '%keyword%'
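
The two queries above can be tried out with Python's built-in `sqlite3` module, as in the following sketch. The table name `metadata_table`, the sample rows and the helper names are assumptions for illustration; the paper does not specify a database engine. The keyword is passed as a bound parameter rather than pasted into the SQL text, which avoids quoting problems with user input.

```python
import sqlite3

SAMPLE_ROWS = [
    ("stream.avi", "stream", "A talk on streaming video"),
    ("peers.mpeg", "Peer networks", "Overview of overlays"),
]


def make_database() -> sqlite3.Connection:
    """Create an in-memory metadata table filled with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata_table "
        "(file_name TEXT, file_title TEXT, file_synopsis TEXT)"
    )
    conn.executemany("INSERT INTO metadata_table VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def exact_search(conn: sqlite3.Connection, keyword: str) -> list:
    """Return file names where a metadata field equals the keyword."""
    sql = (
        "SELECT file_name FROM metadata_table "
        "WHERE file_name = ? OR file_title = ? OR file_synopsis = ? "
        "ORDER BY file_name"
    )
    return [row[0] for row in conn.execute(sql, (keyword, keyword, keyword))]


def truncation_search(conn: sqlite3.Connection, keyword: str) -> list:
    """Return file names where a metadata field contains the keyword."""
    pattern = f"%{keyword}%"
    sql = (
        "SELECT file_name FROM metadata_table "
        "WHERE file_name LIKE ? OR file_title LIKE ? OR file_synopsis LIKE ? "
        "ORDER BY file_name"
    )
    return [row[0] for row in conn.execute(sql, (pattern, pattern, pattern))]
```

With these sample rows, an exact search for "peer" finds nothing because no field is exactly that word, whereas the truncation search matches "peers.mpeg", which shows how truncation broadens the result set.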


[Figure 2: flowchart of the lookup process, partially recovered. Legible labels:
"Child supervisor: check and match to local metadata"; "NO"; "Peer received reply".]