Dynamic Threshold Based Load Balancing Algorithms

1 Introduction

Grid computing has recently become one of the most important research topics in the field of computing. The Grid paradigm has gained popularity due to its capability to offer easier access to geographically distributed resources operating across multiple administrative domains. The Grid environment is a combination of dynamic, heterogeneous and shared resources that aims to provide faster and more reliable access to Grid resources. For efficient resource management in a Grid, resource overloading must be prevented, which can be achieved by proper load balancing mechanisms [1–3].

In the existing work [4], a dynamic, decentralized load balancing technique is proposed that considers all the factors pertaining to the characteristics of the Grid computing environment. A dynamic threshold value is used at each level, and this value changes dynamically according to the size of the Grid in the network. A well-designed information exchange scheduling scheme is adopted in the proposed technique to enhance the efficiency of the load balancing model. A comparative analysis between the existing and the proposed technique shows why the proposed technique performs better than other existing algorithms. The proposed technique is a new version of the LB mechanism based on a random policy. This paper presents the design of the system environment, the LB process for random job arrival through a Poisson process, and the flowchart of the proposed hierarchical load balancing technique [5–12] along with its fitness function.

Sharing resources across organizational and institutional boundaries requires an infrastructure to coordinate resources within so-called virtual organizations. Grid technology builds the infrastructure for virtual organizations. Such an infrastructure should make it easy to form virtual organizations, share resources, and discover and consume services. A Grid functionally combines globally distributed computers and information systems to create a universal source of computing power and information. A Grid can offer a resource balancing effect by scheduling Grid jobs on machines with low utilization. Proper load balancing across the Grid can improve overall system performance and lower the turnaround time of individual jobs. Fairly assigning jobs to resources is a crucial concern in a distributed environment [13–16]. Table 1 compares the different technologies. The main goal is to distribute the jobs among processors so as to maximize throughput and resource utilization while maintaining stability. This is achieved by proper load balancing techniques, which offer the following benefits.

• Load balancing reduces mean job response time under job transfer overhead.
• Load balancing increases the performance of each host.
• Small jobs do not suffer from starvation.

1.1 Our Contribution

Given the load balancing challenge in grid computing, there is currently a need for a hierarchical load balancing algorithm that takes into account grid architecture, heterogeneity, response time, resource allocation and makespan. Taken together, the main contributions and novelties of our research work are as follows:

Table 1 Difference between technologies

Parameter | Cluster computing | Parallel computing | Distributed computing | Grid computing
Loosely coupled system | No | No | Yes | Yes
Shared memory | No | Yes | No | No
Centralized job management and scheduling | Yes | Yes | No | No
Resources heterogeneity | No | No | Yes | Yes
Resource dynamicity | | | Less | More
Scalability | | Less scalable | Less scalable | Highly scalable
Work for a common goal | | | No | No
Security | | | Less | Less

1. To the best of our knowledge, this is a novel technique for hierarchical load balancing [2] that takes into account all the factors pertaining to the characteristics of the grid computing environment mentioned above. The technique also provides backup and support in case of server failure.

2. The proposed approach eliminates the scalability complexity of a site. Most existing load balancing techniques use a static threshold value, which works for a restricted network size but causes problems, such as scalability and response time issues, in a large-scale grid environment. To overcome these problems, a dynamic threshold value is used in the proposed work: every time the number of applications or nodes increases or decreases, the threshold value is recalculated. This is easy to use and fits well with the hierarchical load balancing technique.

3. An efficient load balancing technique is adopted to enhance the efficiency of the load balancing model. We propose an extended version of the random policy for selecting a node during load balancing, which can be initiated by an under loaded or an overloaded node in the pool of shared nodes.

This section gives an overview of Grid computing. It discusses technologies such as distributed computing and cluster computing, the various kinds of Grid-based services, the types and components of a Grid, and the benefits of a Grid environment. Sect. 2 presents the preliminaries for the proposed load balancing algorithms. Related work is discussed in Sect. 3. Sect. 4 focuses on the system model and the structure of the load balancing model, along with the design of the system environment and its application to the network environment; it then presents the problem description, flowchart and fitness functions of the proposed model. The load balancing flowchart and algorithms proposed for the Grid environment are described in Sect. 5. To illustrate the load balancing technique, a case study is considered in Sect. 6. Performance evaluations and simulation results are presented in Sect. 7. Finally, Sect. 8 is dedicated to the conclusion and future work.

2 Preliminaries

Qureshi and Rehman [16] documented a Grid architecture for load balancing, as illustrated in Fig. 1. They used a Poisson process for random job arrival with a random computation length. The jobs are assumed to be sequenced, mutually independent with respect to the arrival rate, and executable on any site. Furthermore, the site should meet each job's demand for computing resources and for the amount of data transmitted. Each processor can execute only one job at a time, and the execution of a job cannot be interrupted or moved to another processor while it is running.

Fig. 1 Basic comparison between three techniques

N. Rathore

This model is divided into three levels: Level-0 Broker, Level-1 Resource, and Level-2 Machine. When a new job, known as a Gridlet, arrives at a machine, it may go to an under lightly loaded, lightly loaded, overloaded or normally loaded resource, according to the load calculation computed at each node. In order to analyze the mean job response time, a single Grid Broker (GB) section is considered as a simplified Grid model. The Grid Broker is the top manager of a Grid environment. It is responsible for sustaining the overall Grid activities of scheduling and rescheduling. It acquires workload information from the Grid resources and sends tasks to resources to optimize the load. Resources, which come next to the Grid Broker in the hierarchy, are connected through the internet. The resource manager is responsible for maintaining the scheduling and load balancing of its machines, and it sends an event to the Grid Broker when it is overloaded. The machine is a Processing Element (PE) manager, responsible for task scheduling and load balancing of its PEs, and is connected with various resources via LAN. The PE manager also sends an event to the resource when it is overloaded. PEs, which come next to machines, are mainly responsible for calculating workload and threshold values for load balancing and job migration, and they pass the load information upwards to machines via buses. A Gridlet is considered as a load and is assigned to any of the PEs according to their capability (i.e. computational speed, queue length, etc.).
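The job-arrival assumption used in this section (Poisson arrivals with a random computation length) can be illustrated with a minimal Python sketch. The function name, the uniform length distribution and all parameter names are assumptions made for illustration only; they are not taken from the model in [16] or from GridSim.

```python
import random


def generate_gridlets(rate, horizon, min_len, max_len, seed=0):
    """Return (arrival_time, length) pairs for a Poisson arrival process.

    rate     -- mean number of arrivals per time unit (assumed parameter)
    horizon  -- simulated time span
    min_len, max_len -- bounds of the random computation length (assumed uniform)
    """
    rng = random.Random(seed)
    t = 0.0
    jobs = []
    while True:
        # Inter-arrival times of a Poisson process are exponentially
        # distributed with mean 1 / rate.
        t += rng.expovariate(rate)
        if t > horizon:
            break
        jobs.append((t, rng.uniform(min_len, max_len)))
    return jobs


jobs = generate_gridlets(rate=2.0, horizon=10.0, min_len=100, max_len=500)
```

Each generated job is independent of the others, which matches the assumption that jobs are mutually independent and can be executed on any site.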

3 Related Work

Many papers have been published to address the problem of load balancing in different environments such as Grid computing [30–34], peer-to-peer and distributed systems. Some of the proposed Grid computing load balancing policies are modifications or extensions of the traditional load balancing policies for distributed systems. Some of the studied policies are described in the next section, and some are summarized here.

Maoz et al. [17] present a distributed model of dynamic load balancing that reduces migration times and down-times and focuses on the topology and the physical parameters of the links; the proposed model is compared with MOSIX process migration and Jobrun's VM migration. In controlled job migration [18], the authors proposed a distributed, distance-controlled load balancing technique that reduces communication cost. A feature of this technique is that it maintains stability and security by limiting the distance; its limitation is that each node cannot exchange jobs with all other nodes. Heiss and Schmitz [19] presented a decentralized approach to load balancing that focuses on minimizing communication delays and communication costs and on avoiding unproductive migration and oscillations. In the paper on distributed route control schemes [20], the authors proposed a centralized load balancing model that handles traffic fluctuations; its main feature is the use of centralized routing techniques to distribute the incoming traffic of a multihomed stub network among its various egress links. Razzaqu et al. [21] formulated a distributed load balancing technique that tries to reduce the number of messages transferred between two nodes, so as to decrease scheduling decision time and improve system performance. Its major contributions include a workload migration technique and a dynamic and stable job scheduling technique that requires only 2(K - 1) messages to decide whether to execute a process locally or remotely. In [22], the authors proposed a distributed Biased Random Sampling (BRS) technique in which the network structure can be changed dynamically to distribute the load efficiently. Because the load information is encoded in the network structure, load balancing is achieved without the need to monitor the nodes for their resource availability. Lee et al. [23] presented a distributed technique based on PUSH and PULL job migration algorithms. It reduces average job queue wait times and avoids starvation of large jobs through a backfilling counter mechanism. In [24], the authors proposed distributed algorithms for QoS load balancing, in which users have quality of service (QoS) demands; the threshold formulation significantly strengthens the locality constraint by using only information about the currently allocated resource. A limitation of that paper is that it is not clear whether the bounds it presents are optimal; for instance, a different, more elaborate protocol might achieve even doubly logarithmic convergence time. Saruladha and Santhi [25] present a distributed agent-based approach that improves the response time of user-submitted jobs, decreases the overall execution time required to complete the submitted jobs, and finds under loaded nodes more quickly.

In [26], the authors present a decentralized load balancing algorithm whose goal is to allocate the available channels to the agents in a balanced way such that the overall system achieves the best possible performance. Many other channel allocation algorithms require a priori information on whether the channel allocation is done with overlapping or non-overlapping channels; this approach has no such requirement, as the method is not based on explicit interference models. However, the algorithm is executed strictly locally. In [27], the authors presented scalable, adaptive, and distributed algorithms for load balancing across resources for data-intensive computations in Grid environments. The objective is to minimize the Average Response Time (ART) and the total execution time of jobs that arrive at a Grid system for processing. Several constraints, such as communication delays due to the underlying network, processing delays at the processors, and an arbitrary topology of the Grid system, are explicitly considered in the problem formulation. The proposed algorithms are adaptive in the sense that they estimate different types of strongly influencing system parameters, such as the job arrival rate, processing rate, and load on the processor, and use this information to estimate the finish time of a job on a buddy processor. The authors address these issues by proposing two job migration algorithms, Modified ELISA (MELISA) and Load Balancing on Arrival (LBA). MELISA, which is applicable to large-scale systems, is a modified version of ELISA [28] that considers the job migration cost, resource heterogeneity, and network heterogeneity when balancing load. The LBA algorithm, which is applicable to small-scale systems, performs load balancing by estimating the expected finish time of a job on buddy processors on each job arrival. The paper [29] describes a centralized model for dynamic load balancing with multiple supporting nodes in distributed systems. The objective of this algorithm is to reduce communication delay and traffic to some extent, which is achieved because the load is transferred within the cluster itself (from the primary node to a supporting node).

4 System Model and Problem Description

The work proposed here is an enhancement of the work in [2, 16] discussed above. It argues that the load must be quantified in order to achieve load balancing in a computational grid. The load is quantified, and the objective function is derived from the load distribution of the computational nodes. Response time and resource allocation are recorded as contributions of this research. Furthermore, the existing technique is extended into two cases, as discussed in Table 2.

Two cases of the proposed technique have been considered. In the first case, the lightly loaded category is divided into lightly loaded and under lightly loaded categories in the context of a variable threshold and standard deviation. Thus, the nodes are divided into four pools. This technique minimizes the time needed to search for the receiver machine to which the Gridlets are transferred. A well-designed information exchange scheduling scheme is adopted in the proposed technique to enhance the efficiency of the load balancing model.

In the second case, the lightly loaded category is divided into lightly loaded and under lightly loaded categories, and the heavily loaded category into heavily loaded and highly heavily loaded categories, again in the context of a variable threshold and standard deviation. Thus, the nodes are divided into five pools. This technique reduces the searching time compared to the existing technique, but it increases the number of comparisons, which can grow exponentially.

In the second case, finding an appropriate node for migration is a complex task that can increase the cost and storage requirements compared to the existing techniques. Therefore, the first case is better than both the existing technique and the second case, and it is the one implemented in this work. A comparative analysis, presented in Fig. 1, depicts the differences between the techniques and shows why the proposed algorithm performs better than the existing ones.

The proposed technique is thus an enhanced version of the load balancing mechanism using a random policy. The proposed random LB policy chooses the target endpoint randomly from the specified list. The proposed technique not only increases the resource allocation efficiency of Grid resources but also cuts down the response time of the entire Grid. The algorithm is examined on the GridSim simulator to judge its effectiveness, especially on a Grid platform. The detailed steps and architecture of the proposed technique are explained in the subsequent sections.
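As a rough illustration of a random selection policy over the receiver pools of the first case, the following Python sketch picks a target at random from a given list. The function name and the preference for the under lightly loaded pool over the lightly loaded pool are assumptions for illustration; the text only states that the target endpoint is chosen randomly from the specified list.

```python
import random


def pick_receiver(under_lightly_loaded, lightly_loaded, rng=random):
    """Pick a random receiver node.

    Assumption: the under lightly loaded pool is tried first, then the
    lightly loaded pool. Returns None when both pools are empty.
    """
    for pool in (under_lightly_loaded, lightly_loaded):
        if pool:
            return rng.choice(pool)
    return None


receiver = pick_receiver(["m1", "m4"], ["m2"])
```

Because only one of two pools is inspected, the sender never has to scan every node in the Grid to find a receiver.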


Table 2 Comparison between three techniques

Techniques | Existing technique | Proposed technique (Case-1) | Proposed technique (Case-2)
Categories | Divided into three pools: lightly, normally, heavily | Divided into four pools: under lightly, lightly, normally, heavily | Divided into five pools: under lightly, lightly, normally, heavily, highly heavily
Techniques | Each heavily loaded node has to check all the nodes in the lightly loaded pool | According to load percentage, the overloaded node is checked into one pool, either the lightly or the under lightly loaded pool, on the basis of sorting | Checking has been done in all four pools
Comparisons | 1 | 2 | 4 (exponential increase)
Scheduling | No | Yes | Yes
Sorting | No | Yes (ascending: under lightly and lightly loaded pool; descending: heavily loaded pool) | Yes
Policy | Not applicable | Random: the process can also be executed vice versa (a lightly loaded node can also search for a heavily loaded node) | Any
Response time | Low | High (knows the status of each pool and all nodes are sorted in order) | Medium
Throughput | Low | High (high availability) | Medium
Resource allocation capacity | Low | High (due to the idea of load percentage) | Medium
Cost | Medium | Low | High
Time | High (each node has comparisons) | Low (due to sorting technique) | Medium (more explored in the pool)

In Grid environments, the shared resources are dynamic in nature, which in turn affects application performance. Workload and resource management are two essential functions provided at the service level of the Grid software infrastructure. To improve the global throughput of these environments, effective and efficient load balancing algorithms are fundamentally important.

Load balancing is the process of redistributing the work load among nodes of the distributed system to improve both resource utilization and job response time while also avoiding a situation where some nodes are heavily loaded while others are idle or doing little work. Static load balancing strategies are not suitable for the environment where resources are heterogeneous and dynamic in nature. In this case dynamic algorithms work well.

A dynamic load balancing algorithm assumes no a priori knowledge about job behavior or the global state of the system, i.e. load balancing decisions are based solely on the current status of the system. In centralized load balancing schemes, the reliance on one central point of balancing control could limit future scalability, which makes them unsuitable for a Grid environment. A distributed scheme helps solve the scalability problem; thus, the development of an efficient dynamic distributed load balancing technique is important to improve the overall performance of the Grid.

The focus of our study is therefore to consider the factors that can be used as characteristics for deciding when to initiate load balancing, with the aim of improving overall system performance and reducing the average job response time and execution time.

5 Load Balancing Model

The proposed load balancing technique is based on a hierarchical model. There are three levels in this hierarchy: Grid level, Resource level (cluster), and Machine level, as shown in Fig. 2. It is assumed that the Grid level consists of a collection of clusters connected by a communication network. Each resource/cluster may contain multiple machines, and each machine can have more than one Processing Element. The machines in a resource are heterogeneous in nature. The differences may be in the hardware architecture, operating system, and processing power. In this study, heterogeneity refers only to the processing power of the machine.

The processing power of the Grid cluster is measured by the average CPU speed across all computing nodes within the machine. Each level in this hierarchy (illustrated in Fig. 2) is explained in detail below. The Grid Broker is the top manager of a Grid environment and is responsible for maintaining the overall Grid activities of scheduling and rescheduling. It gets workload information from the Grid resources and sends tasks to resources to optimize the load. The Resource is next to the Grid Broker in the hierarchy. It is responsible for maintaining the scheduling and load balancing of its machines, and it sends an event to the Grid Broker if it is overloaded. The Machine is a Processing Element (PE) manager. It is responsible for task scheduling and load balancing of its PEs, and it sends an event to the resource if it is overloaded.
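The three-level hierarchy can be sketched as a small set of Python data classes. All class and function names here are hypothetical, and the reading of "processing power" as the mean MIPS rating over every PE of a resource is an assumption made for illustration, not a definition taken from the paper or from GridSim.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class ProcessingElement:
    rating: float  # CPU speed as a MIPS rating


@dataclass
class Machine:
    pes: List[ProcessingElement] = field(default_factory=list)


@dataclass
class Resource:
    # A cluster managed by one resource manager (Level 1).
    machines: List[Machine] = field(default_factory=list)


@dataclass
class GridBroker:
    # Root of the hierarchy (Level 0).
    resources: List[Resource] = field(default_factory=list)


def processing_power(resource: Resource) -> float:
    """Average MIPS rating over all PEs of a resource (assumed reading)."""
    ratings = [pe.rating for m in resource.machines for pe in m.pes]
    return mean(ratings)


cluster = Resource([
    Machine([ProcessingElement(1000), ProcessingElement(2000)]),
    Machine([ProcessingElement(3000)]),
])
broker = GridBroker([cluster])
```

Keeping each level as its own type mirrors the way load information flows upwards from PEs to machines, resources and finally the broker.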

Fig. 2 Hierarchical structure of Grid (Level 0: Grid Broker; Level 1: Resources 1…r; Machines 1…m; PEs 1…p; Gridlets 1…g)


Level 0: Grid Broker (GB)

This first level has a virtual node called the Grid Broker, which is the root of the hierarchy. Each GB manages a pool of Resource Managers (RMs) in its geographical area. The role of the GB is to collect information about the active machines and PEs managed by its corresponding RMs. GBs are also involved in the task allocation and load balancing process in the Grid. The Grid Broker performs the following functions:

• It maintains the workload information of the entire hierarchy.
• It manages global load balancing between the Grid resources; the algorithm used for this is called GB-level load balancing.
• It sends the load balancing decisions to the resources of level 1 for execution.

Level 1: Resource Manager (RM)

Each virtual node of this level, called a resource manager, is associated with a physical Grid cluster. Every RM is responsible for managing a pool of machines. The role of the RM is to collect information about the active processing elements in its pool. The collected information mainly includes CPU speed and other hardware specifications. In addition, each RM is responsible for allocating incoming jobs to any processing element in its pool according to a specified load balancing algorithm. The number of RMs ranges from 1 to r, as shown in Fig. 5. In our load balancing strategy, this virtual node is responsible for:

• Maintaining the workload information relating to each of its machines.
• Estimating the workload of its associated machines and allocating incoming jobs to any machine in its pool.
• Managing local load balancing; the algorithm used for this is called Resource-level load balancing.
• Deciding whether to invoke the GB-level load balancing algorithm.

Level 2: Machine Manager (MM)

At this last level are the processing elements of the Grid, which are linked to their respective machines; these machines are in turn linked to their respective resource/cluster.

Any machine can join the Grid system by registering with an RM and offering its computing resources to the Grid users. Each machine can have more than one PE, and each PE has a specified CPU speed expressed as a Million Instructions Per Second (MIPS) rating. The number of MMs in any resource ranges from 1 to m. Each machine has one or more PEs attached to it, which are responsible for the actual execution of jobs or Gridlets. Once a Gridlet is submitted to a machine, the MM assigns it to an available PE. Gridlet migration is performed from this level only. The number of PEs in any particular machine ranges from 1 to p. Each MM is responsible for:

• Maintaining the workload information of its associated PEs.
• Estimating the workload of all PEs.
• Managing local load balancing; the algorithm used for this is called Machine-level load balancing.
• Deciding whether to invoke the Resource-level load balancing algorithm.

The main characteristics of our strategy can be summarized as follows:


• It gives preference to local task transfer (within a resource or cluster) over global transfer (task transfer between resources or clusters).
• It reduces the number of tasks moving across the Grid Broker of a Grid architecture.
• It allows more than one load balancing operation to be performed at the same time.
• It supports Grid heterogeneity and scalability.
• It is totally independent of any physical Grid architecture.

The flowchart of the proposed load balancing technique is shown in Fig. 3.

Fig. 3 Flowchart of proposed hierarchical load balancing technique


5.1 Proposed Algorithm

Our proposed load balancing technique is hierarchical and targets a heterogeneous Grid environment. Here, heterogeneity refers to the processing capability of each machine. It is assumed that the processing capability of every PE within a single machine is the same. The proposed algorithm uses a sender-initiated strategy, which means that the machine or resource that wants to transfer a Gridlet searches for an under lightly loaded machine or resource.

5.2 Load Balancing Approach

Our proposed load balancing approach works at three levels: Broker, Resource, and Machine (Level-0, Level-1, and Level-2 respectively). When a new Gridlet arrives at a machine, the machine submits it to a PE that is lightly loaded. In addition, after any of the four defined load-changing activities occurs, the machine checks the load of all PEs and classifies them as under lightly loaded, lightly loaded, over loaded, or normal.

In our proposed scheme, PEs, machines and resources are each categorized into four classes: under lightly loaded, lightly loaded, over loaded, and normal. Each class is described below:

• Normal loaded: Any PE/Machine/Resource comes under this category if its load is equal to some pre-specified threshold value.
• Over loaded: Any PE/Machine/Resource comes under this category if its load is greater than some pre-specified threshold value.

The lightly loaded category is further divided into lightly loaded and under lightly loaded in order to minimize the time needed to find a receiver machine for the Gridlets: only the under lightly loaded or the lightly loaded machines need to be checked, which reduces the searching time.

• Lightly loaded: Any PE/Machine/Resource comes under this category if its load is greater than 50 % of some pre-specified threshold value.
• Under lightly loaded: Any PE/Machine/Resource comes under this category if its load is less than 50 % of some pre-specified threshold value.
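The four categories can be expressed as a small classification function. This is a minimal sketch under stated assumptions: the function name and the `tol` parameter are hypothetical, a load of exactly 50 % of the threshold (which the text leaves unspecified) is treated as under lightly loaded, and "lightly loaded" is read as lying between 50 % of the threshold and the threshold itself.

```python
def classify(load, threshold, tol=0.0):
    """Classify a PE/Machine/Resource load against a threshold.

    tol is an assumed tolerance for the 'equal to threshold' test;
    with tol=0.0 the comparison is exact, as in the text.
    """
    if abs(load - threshold) <= tol:
        return "normal"
    if load > threshold:
        return "over loaded"
    if load > 0.5 * threshold:
        return "lightly loaded"
    # Assumption: exactly 50 % of the threshold falls in this category.
    return "under lightly loaded"


def build_pools(loads, threshold):
    """Group node names into the four pools by their load."""
    pools = {"normal": [], "over loaded": [],
             "lightly loaded": [], "under lightly loaded": []}
    for name, load in loads.items():
        pools[classify(load, threshold)].append(name)
    return pools


pools = build_pools({"pe1": 0.9, "pe2": 0.6, "pe3": 0.4, "pe4": 0.2}, 0.6)
```

A sender then only needs to look at the "under lightly loaded" and "lightly loaded" pools when searching for a receiver.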

The proposed load balancing scheme is simulated on the GridSim.

5.3 Notations Used in Algorithms

The notations used at the PE level are summarized in Table 3. The notations used at the machine level are summarized in Table 4, and those used at the resource level in Table 5.

5.4 Formulas Used

The formulas used in the proposed algorithm are given in Table 6.


Table 3 Notations used at PE level

S. no. | Notations | Description
1 | PE | Processing element
2 | g | Number of gridlets on any PE
3 | p | Number of PEs on any machine
4 | PELoad | Load of any PE
5 | PEsLoad | Total load of available PEs in any machine
6 | OPEList | List of PEs with overloaded Gridlets
7 | OP | Size of OPEList (O: Overloaded and P: PE)
8 | LPEList | List of PEs with lightly loaded gridlets
9 | LP | Size of LPEList (L: Lightly loaded and P: PE)
10 | ULPEList | List of PEs with under lightly loaded gridlets
11 | UP | Size of ULPEList (U: Under lightly loaded and P: PE)
12 | NPEList | List of PEs with normal task
13 | NP | Size of NPEList (N: Normal-loaded and P: PE)
14 | (PELoad)_D | Load of processing element D
15 | (TaskLoad)_L | Load of task L
16 | (PELoad)_E | Load of processing element E
17 | Rating | Actual capacity of a PE, i.e. CPU speed in terms of MIPS rating
18 | (Curr_Load)_PE | Current load of PE

Table 4 Notations used at machine level

S. no. | Notations | Description
1 | M | Number of machines on any resource
2 | MLoad | Load of any machine
3 | MsLoad | Total load of machines in any resource
4 | OMList | List of machines with overloaded PEs
5 | OM | Size of OMList (O: Overloaded and M: Machine)
6 | LMList | List of machines with lightly loaded PEs
7 | LM | Size of LMList (L: Lightly loaded and M: Machine)
8 | ULMList | List of machines with under lightly loaded PEs
9 | UM | Size of ULMList (U: Under lightly loaded and M: Machine)
10 | NMList | List of machines with normal PEs
11 | NM | Size of NMList (N: Normal-loaded and M: Machine)
12 | (MLoad)_D | Load of machine D
13 | (MLoad)_E | Load of machine E

5.5 Load Parameter

The proposed algorithms use a different load parameter at each level: the PE level load is denoted by q, the machine level load by g, and the resource level load by q. The load parameters are given in Table 7. These parameters are taken from [28].


Table 5 Notations used at resource level

S. no. | Notations | Description
1 | | Number of resources
2 | ORList | List of resources with overloaded machines
3 | OR | Size of ORList (O: Overloaded and R: Resource)
4 | LRList | Machine with lightly loaded PEs
5 | LR | Size of LRList (L: Lightly loaded and R: Resource)
6 | ULRList | Resource with underloaded machines
7 | UR | Size of ULRList (U: Under lightly loaded and R: Resource)
8 | NRList | Resource with normal machines
9 | NR | Size of NRList (N: Normal-loaded and R: Resource)
10 | RLoad | Load of any resource

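Table 6 Formulas

S. no. | Formulae
1 | $(\mathrm{Curr\_Load})_{PE} = \sum_{i=1}^{g} (\mathrm{File\_Size})_i$, where $g$ is the number of Gridlets
2 | $\mathrm{PELoad} = (\mathrm{Curr\_Load})_{PE} / \mathrm{Rating}$
3 | $\mathrm{PEsLoad} = \sum_{j=1}^{p} (\mathrm{PELoad})_j$
4 | $\mathrm{MLoad} = \mathrm{PEsLoad} / p$
5 | $\mathrm{MsLoad} = \sum_{k=1}^{m} (\mathrm{MLoad})_k$
6 | $\mathrm{RLoad} = \mathrm{MsLoad} / m$
7 | $(\mathrm{TaskLoad})_g = (\mathrm{File\_Size})_g / \mathrm{Rating}$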
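The formulas of Table 6 translate directly into a few lines of Python. The function names below are hypothetical and chosen only to mirror the table; the sketch assumes non-empty PE and machine lists and a positive rating.

```python
def pe_load(file_sizes, rating):
    """Formulas 1 and 2: current load of a PE divided by its MIPS rating."""
    curr_load = sum(file_sizes)      # (Curr_Load)_PE, summed over g Gridlets
    return curr_load / rating        # PELoad


def machine_load(pe_loads):
    """Formulas 3 and 4: total PE load averaged over the p PEs."""
    return sum(pe_loads) / len(pe_loads)


def resource_load(machine_loads):
    """Formulas 5 and 6: total machine load averaged over the m machines."""
    return sum(machine_loads) / len(machine_loads)


def task_load(file_size, rating):
    """Formula 7: load contributed by a single Gridlet."""
    return file_size / rating


m_load = machine_load([pe_load([100, 200, 300], 1000),
                       pe_load([400, 500], 1000)])
```

Each level simply averages the loads reported by the level below it, so load information can be passed upwards through the hierarchy without any node needing global knowledge.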

Table 7 Load parameter

Load parameter | Level
q = 0.6 | PE level load
g = 0.75 | Machine level load
q = 0.8 | Resource level load

Details of these parameters are as follows. In [28], the authors use the concepts of group and element. Depending on the case, a group designates either a cluster or the Grid (level 1 or level 0 in the tree). An element is a component of a group (a worker node of level 2 or a cluster of level 1).

They consider the standard deviation of the workload index in order to measure the deviation between the elements involved. To account for the heterogeneity of node capabilities, they propose to take the processing time as the workload index. The processing time of an entity (element or group) is defined as the ratio between the workload (LOD) and the capability (SPD) of this entity.

They define a balance threshold, denoted as q, below which the standard deviation is considered to tend to zero and hence the group is considered balanced. For this purpose, they define a threshold q ∈ [0, 1] and give the following expression, where σ is the standard deviation: If (σ ≤ q) Then the group is Balanced Else it is Imbalanced.
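The balance test can be sketched in Python as follows. The function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say which form of standard deviation [28] uses or whether the processing times are normalized first.

```python
from statistics import pstdev


def processing_time(workload, capability):
    """Workload index of an entity: workload (LOD) over capability (SPD)."""
    return workload / capability


def is_balanced(elements, balance_threshold):
    """Return True when the standard deviation of the processing times
    of the group's elements does not exceed the balance threshold.

    elements -- iterable of (workload, capability) pairs
    """
    times = [processing_time(lod, spd) for lod, spd in elements]
    return pstdev(times) <= balance_threshold


balanced = is_balanced([(10, 10), (20, 20)], 0.6)
imbalanced = is_balanced([(10, 10), (40, 10)], 0.6)
```

In the first call both elements have a processing time of 1.0, so the deviation is zero; in the second the times are 1.0 and 4.0, giving a deviation of 1.5, which exceeds the threshold.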


For an imbalanced state, they determine the overloaded elements (sources) and the under loaded ones (receivers), depending on the processing time of every element relative to the processing time of the associated group.

A group can be balanced while being saturated. In this particular case, it is not useful to start intra-group load balancing, since its elements will remain overloaded. To measure saturation, they introduce another threshold, called the saturation threshold and denoted as g. When the current workload of a group approaches its capacity, it is useless to balance it, since all of its components are saturated.

In order to transfer tasks from overloaded elements to under loaded ones, they propose the following method:

• Evaluation of the total amount of workload "Supply" available on the receiver elements.

• Computation of the total amount of workload "Demand" required by the source elements.

• If the supply is much lower than the demand (the supply is far from satisfying the request), it is not recommended to start local load balancing.

Here, the authors introduce a third threshold, called the expectation threshold and denoted ρ, to measure the relative deviation between supply and demand. They give the following rule:

If (Supply/Demand > ρ) Then perform a Local load balancing Else perform a Higher level load balancing.
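The supply/demand rule above can be written as a small decision function. The sketch below is a minimal illustration, not the method of [ 28 ]: the function name, the returned labels, and the handling of a zero demand are assumptions introduced here.

```python
def choose_balancing_level(supply: float, demand: float,
                           expectation_threshold: float) -> str:
    """Decide where load balancing should take place.

    supply: total workload the receiver elements can still accept
    demand: total workload the source elements need to give away
    """
    if demand <= 0:
        # Assumption: with no demand there is nothing to transfer.
        return "none"
    if supply / demand > expectation_threshold:
        return "local"   # receivers in the group can absorb the excess
    return "higher"      # escalate to the next level of the tree
```

For example, with an expectation threshold of 0.8, a supply of 90 against a demand of 100 leads to local balancing, while a supply of 50 leads to higher level balancing.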

5.6 Algorithm

The load balancing algorithm is presented in Load_Blnc_Algo() (Algorithm-1). PELoad_Calc_Algo() (Algorithm-2), MLoad_Calc_Algo() (Algorithm-3), and RLoad_Calc_Algo() (Algorithm-4) show the PE level, machine level, and resource level load calculations, respectively. Algorithms Machine_Level_LB_Algo() (Algorithm-5 and Algorithm-6), Res_Level_LB_Algo() (Algorithm-7 and Algorithm-8), and GB_Level_LB_Algo() (Algorithm-9 and Algorithm-10) show machine level, resource level, and broker level load balancing, respectively. In the following algorithms, the PE level, machine level, and resource level load thresholds are denoted by ∂, η, and ρ, respectively.

The load parameters are given in Table 7 and explained in Sect. 5.5.

Load balancing takes place in a Machine whenever its current load situation changes. The following activities change the load condition of a system:

• Arrival of a new job and queuing of that job at a particular node.
• Completion of execution of a job.
• Arrival of a new resource.
• Withdrawal of an existing resource.

Whenever any of these four activities happens, load information is collected and the load balancing condition is checked. If the condition is fulfilled, the actual load balancing activity is performed. The proposed algorithms are described below:


5.6.1 Algorithm-1: Load_Blnc_Algo ()

Our first proposed algorithm is Load_Blnc_Algo (), which runs at each machine at level-2. Algorithm-1 is given below:

Algorithm-1: Load_Blnc_Algo ( )
//This algorithm runs at each Machine (at level-2)
1 BEGIN
2   FOR I = 1 to m DO                  //For all Machines belonging to this resource
3     Wait for some activity to happen //Wait for a load change in any Machine
4     IF (activities_occurs)
5       CALL Machine_Level_LB_Algo ( ) //Call algorithm to find out the category of PEs
6     ELSE GOTO Step 3
7     END IF
8   END FOR
9 END

The above algorithm executes for all m machines belonging to the corresponding resource, as shown in step 2. Each machine waits for some activity to happen (step 3). If the condition in step 4 is true, step 5 executes and calls Machine_Level_LB_Algo (); otherwise control returns to step 3.
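The event-driven trigger of Algorithm-1 can be illustrated with a small Python sketch. Everything named here is an assumption for illustration: the `Activity` enumeration encodes the four activities listed earlier, events are modelled as `(machine_id, activity)` pairs on a standard-library queue, and `machine_level_lb` stands in for Machine_Level_LB_Algo (). The sketch stops when the queue stays empty, whereas a real system would keep waiting.

```python
from enum import Enum, auto
from queue import Empty, Queue


class Activity(Enum):
    """The four activities that change the load condition."""
    JOB_ARRIVAL = auto()
    JOB_COMPLETION = auto()
    RESOURCE_ARRIVAL = auto()
    RESOURCE_WITHDRAWAL = auto()


def run_load_balancer(events: Queue, machine_level_lb) -> int:
    """Call machine_level_lb(machine_id) for every queued activity.

    Returns the number of activities handled.
    """
    handled = 0
    while True:
        try:
            # Wait briefly for a load change on any machine.
            machine_id, _activity = events.get(timeout=0.1)
        except Empty:
            break  # sketch only: give up when nothing else arrives
        machine_level_lb(machine_id)
        handled += 1
    return handled
```

Keeping the trigger separate from the balancing routine mirrors the structure of the listing, in which Algorithm-1 only waits for an activity and delegates the actual work.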

5.6.2 Algorithm-2: PELoad_Calc_Algo ()

The load at each PE is calculated by Algorithm-2. This algorithm is called by the Machine_Level_LB_Algo () algorithm. The algorithm is as follows:

Algorithm-2: PELoad_Calc_Algo ( )
1 BEGIN
2   IF (g = 0) then
3     PELoad = 0.0
4   ELSE
5     FOR I = 1 to g DO                                 //g is the number of Gridlets in the PE
6       (Curr_Load) PE = (Curr_Load) PE + (File_Size) I //File_Size in bytes
7     END FOR
8   END IF
9   PELoad = (Curr_Load) PE / rating
10  RETURN PELoad
11 END

In the above algorithm, steps 2 to 8 check the number of Gridlets: if it is zero, PELoad is set to zero; otherwise the loop in step 5 iterates g times to estimate the current load at the PE. The current load is calculated by summing the file sizes of the Gridlets. The file size is given in bytes and is provided by the user as input. In step 9, PELoad is calculated by dividing the current load of the PE by its actual capacity, i.e. its rating (here, actual capacity means CPU speed in terms of MIPS rating). Finally, the algorithm returns PELoad in step 10.
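The PE load calculation can be written compactly in Python. This is a minimal sketch of the steps described above; the function and parameter names are chosen here for illustration, with Gridlet file sizes in bytes and the PE rating in MIPS.

```python
def pe_load(gridlet_file_sizes, mips_rating):
    """Load of one PE: sum of Gridlet file sizes divided by the PE rating."""
    if not gridlet_file_sizes:
        return 0.0  # no Gridlets queued on this PE
    current_load = sum(gridlet_file_sizes)
    return current_load / mips_rating
```

For instance, two Gridlets of 200 and 300 bytes on a PE rated at 1000 MIPS give a PELoad of 0.5.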

5.6.3 Algorithm-3: MLoad_Calc_Algo ()

The load at each Machine is calculated using this algorithm. Machine load (MLoad) is measured by averaging the loads of all available Processing Elements attached to the machine. The algorithm is given below:

Algorithm-3: MLoad_Calc_Algo ( )
1 BEGIN
2   FOR I = 1 to p DO           //For all PEs belonging to this machine
3     CALL PELoad_Calc_Algo ( ) //To calculate PELoad, the workload of the PE
4     Add PELoad to PEsLoad
5   END FOR
6   MLoad = PEsLoad / p         //MLoad = PEsLoad / number_of_PEs
7   RETURN MLoad
8 END

In the above algorithm, step 2 executes the loop p times (where p is the number of PEs), and each iteration calls procedure PELoad_Calc_Algo () in step-3 to obtain the load of one PE. Step-4 adds the PELoad returned by PELoad_Calc_Algo () to PEsLoad, so that the sum of the loads of all PEs connected to the machine can be estimated. Step-6 calculates MLoad by dividing PEsLoad by the total number of PEs (i.e. p). In the last step, the algorithm returns MLoad. This algorithm is called by the Res_Level_LB_Algo () algorithm.

5.6.4 Algorithm-4: RLoad_Calc_Algo ()

RLoad_Calc_Algo () is used to estimate the load at Level-1, i.e. the resource level. Resource load (RLoad) is measured by averaging the loads of all available Machines belonging to the corresponding resource. The algorithm is given below:

Algorithm-4: RLoad_Calc_Algo ( )
1 BEGIN
2   FOR I = 1 to m DO          //All Machines belonging to this Resource
3     CALL MLoad_Calc_Algo ( )
4     Add MLoad to MsLoad
5   END FOR
6   RLoad = MsLoad / m         //RLoad = MsLoad / number_of_Machines
7   RETURN RLoad
8 END


In the above algorithm, the loop in step-2 executes m times, and each iteration calls the MLoad_Calc_Algo() algorithm. The MLoad returned by MLoad_Calc_Algo() is added to MsLoad in step-4. Step-6 calculates RLoad by dividing MsLoad by the total number of machines (i.e. m). This algorithm is called by the GB_Level_LB_Algo() algorithm.
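The three load calculations nest naturally: resource load averages machine loads, which in turn average PE loads. The Python sketch below illustrates this hierarchy under assumed data shapes that are not part of the paper: a PE is a `(file_sizes, rating)` pair, a machine is a non-empty list of PEs, and a resource is a non-empty list of machines.

```python
def pe_load(file_sizes, rating):
    """PELoad: total Gridlet file size divided by the PE rating."""
    return sum(file_sizes) / rating if file_sizes else 0.0


def machine_load(pes):
    """MLoad: average PELoad over the p PEs of one machine."""
    return sum(pe_load(sizes, rating) for sizes, rating in pes) / len(pes)


def resource_load(machines):
    """RLoad: average MLoad over the m machines of one resource."""
    return sum(machine_load(pes) for pes in machines) / len(machines)


# Hypothetical example: one machine with two PEs, one machine with an idle PE.
machine_a = [([500], 1000), ([1500], 1000)]  # PELoads 0.5 and 1.5
machine_b = [([], 1000)]                     # PELoad 0.0
example_rload = resource_load([machine_a, machine_b])
```

In this example machine A has an MLoad of 1.0 and machine B an MLoad of 0.0, so the resource load is 0.5.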

5.6.5 Algorithm-5: Machine Level Algorithm Machine_Level_LB_Algo ()

This algorithm is used to balance the load between the PEs attached to a machine. It runs on each machine and executes whenever a load change occurs. Four lists are used to store the IDs of PEs:

OPEList    Holds the IDs of PEs that are overloaded with Gridlets. Its size is OP.
LPEList    Holds the IDs of PEs that are lightly loaded with Gridlets. Its size is LP.
ULPEList   Holds the IDs of PEs that are under-lightly loaded with Gridlets. Its size is UP.
NPEList    Holds the IDs of PEs that have a normal load of Gridlets. Its size is NP.

This algorithm is divided into two parts, Machine_Level_LB_Algo () and Machine_Level_LB_Algo2 (), which are described as follows:

Algorithm-5: Machine_Level_LB_Algo ( )
1 BEGIN
2   Create OPEList                 //PEs with overloaded Gridlets (size is OP)
3   Create ULPEList                //PEs with under-lightly loaded Gridlets (size is UP)
4   Create LPEList                 //PEs with lightly loaded Gridlets (size is LP)
5   Create NPEList                 //PEs with normal load (size is NP)
6   FOR I = 1 to p DO              //All PEs belonging to this Machine (p: no_of_PE)
7     CALL PELoad_Calc_Algo ( )    //Calculate PELoad (workload of the PE)
8     IF (PELoad == ∂)             //Compare PELoad with threshold value ∂
9       Add this PE to NPEList
10    ELSE IF (PELoad > ∂)
11      Add this PE to OPEList
12    ELSE IF (PELoad < 0.5 * ∂)
13      Add this PE to ULPEList
14    ELSE Add this PE to LPEList
15    END IF
16    END IF
17    END IF
18  END FOR
19  Sort OPEList in descending order of loads
20  Sort ULPEList in ascending order of loads
21  Sort LPEList in ascending order of loads
22  CALL Machine_Level_LB_Algo2 ( )
23 END

In the above algorithm, steps 2 to 5 create OPEList, ULPEList, LPEList, and NPEList to store the IDs and loads of overloaded, under-lightly loaded, lightly loaded, and normal PEs, respectively. Steps 6 to 18 divide the PEs into these four categories. Step-6 ensures that this is done for all the PEs (the loop iterates p times). In each iteration, PELoad is calculated by calling procedure PELoad_Calc_Algo () in step-7. The PEs are then listed in different categories according to their workload, by comparison with the threshold value ∂. The condition in step-8 checks whether PELoad is equal to ∂. If it is, step-9 adds the PE to NPEList; otherwise step-10 checks whether PELoad is greater than ∂ and, if so, step-11 adds the PE to OPEList. Step-12 checks whether PELoad is less than 0.5 times ∂; if so, the PE is added to ULPEList (step-13), otherwise it is added to LPEList (step-14). Steps 19 to 21 sort ULPEList and LPEList in ascending order of load and OPEList in descending order, so that the PE with the greatest load among all PEs is selected first for processing. In step-22, algorithm Machine_Level_LB_Algo2 () is called. This algorithm is called by Algorithm-1.
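The four-way classification of Algorithm-5 can be sketched in Python as follows. The function name, the dictionary input mapping PE IDs to loads, and the tuple return value are assumptions made for this illustration; the comparisons follow the listing, including the exact equality test for the normal category, which floating-point loads will rarely satisfy in practice.

```python
def classify_pes(pe_loads, threshold):
    """Split PE IDs into (overloaded, lightly, under_lightly, normal) lists.

    pe_loads: dict mapping PE ID -> PELoad
    threshold: PE level load threshold
    """
    overloaded, lightly, under_lightly, normal = [], [], [], []
    for pe_id, load in pe_loads.items():
        if load == threshold:
            normal.append(pe_id)
        elif load > threshold:
            overloaded.append(pe_id)
        elif load < 0.5 * threshold:
            under_lightly.append(pe_id)
        else:
            lightly.append(pe_id)
    # Heaviest overloaded PE first; least loaded receivers first.
    overloaded.sort(key=lambda pe: pe_loads[pe], reverse=True)
    under_lightly.sort(key=lambda pe: pe_loads[pe])
    lightly.sort(key=lambda pe: pe_loads[pe])
    return overloaded, lightly, under_lightly, normal
```

With a threshold of 0.6, a PE with load 0.4 is lightly loaded, one with load 0.2 is under-lightly loaded, and one with load 0.9 is overloaded.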

5.6.6 Algorithm-6: Machine_Level_LB_Algo2 ()

This algorithm performs the actual Gridlet migration by checking each overloaded PE (from which a Gridlet is selected for migration) and a suitable under-loaded PE.

In Algorithm-6, the loop in step-2 iterates OP times (OP is the size of OPEList) to choose each PE K, one by one, from the overloaded PE list (OPEList). The loop in step-3 iterates g times (g is the number of Gridlets) to choose a Gridlet L from the selected PE. The next task is to find an under-loaded processing element that is suitable to receive the selected Gridlet. To achieve this, the algorithm chooses the first processing element D from LPEList: in step-4, variable D holds the address of the first element of LPEList. Step-5 checks whether the sum of the load of processing element D and the load of Gridlet L (which is selected for migration from the overloaded PE) is less than 0.5 times the threshold value ∂ (which indicates that the first PE selected from LPEList is suitable for migration of Gridlet L). If this condition is true, step-6 executes and Gridlet L is migrated from processing element K to processing element D; otherwise ULPEList is examined. Step-8 sets variable E to the address of the first element of ULPEList. Step-9 checks whether the sum of the load of processing element E and the load of Gridlet L is less than 0.5 times ∂ (which indicates that the first PE from ULPEList is suitable for migration of Gridlet L). If this condition is true, step-10 executes and Gridlet L is migrated from processing element K to processing element E. Otherwise the Gridlet executes at its originator PE, i.e. K, as in step-11. In step-18 the algorithm calls MLoad_Calc_Algo(). Step-19 checks whether this machine is overloaded, by testing whether the size of the overloaded PE list divided by the total number of PEs is greater than or equal to the threshold value η. If the condition is true, the algorithm notifies its Grid resource that the machine is overloaded by calling Algorithm-7.
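The migration steps of Algorithm-6 described above can be sketched in Python. The data layout is an assumption made for illustration: `pes` maps a PE ID to a dictionary with a `rating` and a list of Gridlet file sizes, and the task load of a Gridlet is computed against the rating of the candidate destination PE, which the paper does not specify. As in the description, only the first element of each receiver list is tried.

```python
def migrate_gridlets(pes, overloaded, lightly, under_lightly, threshold):
    """Move Gridlets off overloaded PEs; return a list of (size, src, dst)."""
    def load(pe_id):
        pe = pes[pe_id]
        return sum(pe["gridlets"]) / pe["rating"]

    moves = []
    for src in overloaded:                        # PE K from OPEList
        for size in list(pes[src]["gridlets"]):   # Gridlet L (iterate a copy)
            for receivers in (lightly, under_lightly):
                if not receivers:
                    continue
                dst = receivers[0]                # first PE of the list (D or E)
                task_load = size / pes[dst]["rating"]  # assumption: dst rating
                if load(dst) + task_load < 0.5 * threshold:
                    pes[src]["gridlets"].remove(size)
                    pes[dst]["gridlets"].append(size)
                    moves.append((size, src, dst))
                    break
            # If no receiver qualified, the Gridlet stays on its originator.
    return moves


def machine_overloaded(num_overloaded_pes, num_pes, machine_threshold):
    """Escalation test: share of overloaded PEs reaches the machine threshold."""
    return num_overloaded_pes / num_pes >= machine_threshold
```

One consequence of the conditions as stated: a PE placed in LPEList by Algorithm-5 already has a load of at least 0.5 times the threshold, so the first test cannot succeed for it and migrations in this sketch always go to the under-lightly loaded list.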

5.6.7 Algorithm-7: Resource Level Algorithm Res_Level_LB_Algo ()

This algorithm is used to balance the load between the Machines attached to a resource. It runs on each resource (Level-1). Four lists are used to store the IDs of machines.


Algorithm-6: Machine_Level_LB_Algo2 ( )
1 BEGIN
2   WHILE (K <= OP) DO                    //Pick one PE from OPEList
3     WHILE (L <= g) DO                   //Pick one Gridlet from selected PE K
4       D = LPEList[0]                    //Pick 1st PE from LPEList
5       IF ((PELoad) D + (TaskLoad) L < 0.5 * ∂) then
6         Shift Gridlet L from (PE) K to (PE) D
7       ELSE
8         E = ULPEList[0]                 //Pick 1st PE from ULPEList
9         IF ((PELoad) E + (TaskLoad) L < 0.5 * ∂) then
10          Shift Gridlet L from (PE) K to (PE) E
11        ELSE task L will be executed at its originator, i.e. K
12        MOVE L to next Gridlet of list  //Look for the next task
13        END IF
14      END IF
15    END WHILE                           //End of task selection
16    MOVE OPEList to next node of list   //Move to next node of OPEList
17  END WHILE                             //End of search in OPEList
18  CALL MLoad_Calc_Algo ( )
19  IF ((OP / p) >= η) then               //(OPEListSize / Number_of_PEs)
20    CALL Res_Level_LB_Algo ( )          //Machine is overloaded
21  END IF
22 END

OMList     Holds the IDs of machines that are overloaded. Its size is OM.
LMList     Holds the IDs of machines that are lightly loaded. Its size is LM.
ULMList    Holds the IDs of machines that are under-lightly loaded. Its size is UM.
NMList     Holds the IDs of machines that have a normal load. Its size is NM.

This algorithm is triggered when an overload situation occurs in any Machine (or when a machine fails to handle the load of its PEs). This situation arises when all PEs connected to a Machine are categorized as heavily loaded and no lightly loaded PE is available to which the Gridlets of the heavily loaded PEs can migrate. This algorithm is divided into two parts, Res_Level_LB_Algo () and Res_Level_LB_Algo2 (), which are described as follows:


Algorithm-7: Res_Level_LB_Algo ( )
1 BEGIN
2   Create OMList                //Machines with overloaded PEs
3   Create LMList                //Machines with lightly loaded PEs
4   Create ULMList               //Machines with under-lightly loaded PEs
5   Create NMList                //Machines with normal PEs
6   FOR J = 1 to m DO            //All Machines belonging to this Resource
7     CALL MLoad_Calc_Algo ( )   //To get MLoad, the workload of the Machine
8     IF (MLoad == η)
9       Add this Machine to NMList
10    ELSE IF (MLoad > η)
11      Add this Machine to OMList
12    ELSE IF (MLoad <= 0.5 * η)
13      Add this Machine to ULMList
14    ELSE Add this Machine to LMList
15    END IF
16    END IF
17    END IF
18  END FOR
19  Sort OMList in descending order of loads
20  Sort ULMList in ascending order of loads
21  Sort LMList in ascending order of loads
22  CALL Res_Level_LB_Algo2 ( )
23 END

The above algorithm first creates OMList, LMList, ULMList, and NMList in steps 2 to 5, to store the IDs of overloaded, lightly loaded, under-lightly loaded, and normal Machines, respectively. Steps 6 to 18 divide the Machines into these four categories. Step-6 ensures that this is done for all the Machines (the loop iterates m times). In each iteration, MLoad is calculated by calling procedure MLoad_Calc_Algo () in step-7. The Machines are then listed in different categories according to their workload, by comparison with the threshold value η. The condition in step-8 checks whether MLoad is equal to η. If it is, step-9 adds the Machine to NMList; otherwise step-10 checks whether MLoad is greater than η and, if so, step-11 adds the Machine to OMList. Step-12 checks whether MLoad is less than or equal to 0.5 times η; if so, the Machine is added to ULMList (step-13), otherwise it is added to LMList (step-14). Steps 19 to 21 sort ULMList and LMList in ascending order of load and OMList in descending order, so that the Machine with the greatest load among all Machines is selected first for processing. In step-22 this algorithm calls Res_Level_LB_Algo2 ().


5.6.8 Algorithm-8: Res_Level_LB_Algo2 ()

Algorithm-8: Res_Level_LB_Algo2 ( )
1 BEGIN
2   WHILE (K <= OM) DO                     //Pick one Machine from OMList
3     WHILE (L <= g) DO                    //Pick one Gridlet from selected Machine K
4       D = LMList[0]                      //Pick 1st Machine from LMList
5       IF ((MLoad) D + (TaskLoad) L < 0.5 * η) then
6         Shift Gridlet L from (Machine) K to (Machine) D
7       ELSE
8         E = ULMList[0]                   //Pick 1st Machine from ULMList
9         IF ((MLoad) E + (TaskLoad) L < 0.5 * η) then
10          Shift Gridlet L from (Machine) K to (Machine) E
11        ELSE Gridlet L will be executed at its originator, i.e. K
12        MOVE L to next Gridlet of list   //Look for the next Gridlet
13        END IF
14      END IF
15    END WHILE                            //End of task selection
16    MOVE OMList to next Machine of list  //Move to next node of OMList
17  END WHILE                              //End of search in OMList
18  CALL RLoad_Calc_Algo ( )
19  IF ((OM / m) >= ρ) then                //(OMListSize / Number_of_Machines)
20    CALL GB_Level_LB_Algo ( )            //Resource is overloaded
21  END IF
22 END

In the above algorithm, the loop in step-2 iterates OM times (OM is the size of OMList) to choose each Machine K, one by one, from the overloaded Machine list (OMList). The loop in step-3 iterates g times (g is the number of Gridlets) to choose a Gridlet L from the selected Machine. The next task is to find an under-loaded Machine that is suitable to receive the selected Gridlet. To achieve this, the algorithm chooses the first Machine D from LMList: in step-4, variable D holds the address of the first element of LMList. Step-5 checks whether the sum of the load of Machine D and the load of Gridlet L (which is selected for migration purposes
