
15.2.5 Automating Scalability

Automatic scalability at the application level can be implemented in several ways. The two most significant are:

1. Users provide a set of rules in a well-defined language (e.g. see Galán et al. (2009)) that feeds an “application controller” acting on behalf of the user and in charge of enforcing the specified scalability rules (an illustrative sketch of such rules is given right after this list).

2. Design and implement application-specific algorithms or statistical methods that let the controller know when the application should scale, without resorting to users’ direct actions as in an expert system.
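Purely for illustration, the sketch below shows one possible way to encode user-provided scalability rules and the controller that enforces them. The rule fields, metric names, thresholds and actions are hypothetical; the actual rule language of Galán et al. (2009) is presented in Section 15.3.2.

    # Illustrative encoding of user-provided scalability rules and of the
    # "application controller" that enforces them. All fields are made up.
    SCALABILITY_RULES = [
        {"metric": "avg_cpu", "operator": ">", "threshold": 0.80, "action": "add_vm"},
        {"metric": "avg_cpu", "operator": "<", "threshold": 0.20, "action": "remove_vm"},
    ]

    def controller(rules, observed):
        """Return the actions of every rule whose condition holds for the
        observed metrics (a dict such as {"avg_cpu": 0.91})."""
        triggered = []
        for rule in rules:
            value = observed.get(rule["metric"])
            if value is None:
                continue
            if rule["operator"] == ">" and value > rule["threshold"]:
                triggered.append(rule["action"])
            elif rule["operator"] == "<" and value < rule["threshold"]:
                triggered.append(rule["action"])
        return triggered

    print(controller(SCALABILITY_RULES, {"avg_cpu": 0.91}))   # -> ['add_vm']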

As for the first approach above, a very relevant example is given below (see Section 15.3.2). Suffice it to say here that rule-based systems provide meaningful descriptions of the appropriate conditions to scale, and data-mining techniques can be employed to extract relevant rules and help users (service providers) gain runtime knowledge of their application’s performance and of optimization techniques. On the other hand, algorithmic techniques (by which we also mean statistical tools, neural networks, or traditional control-theory techniques) provide a degree of control over real-time response that rule-based systems cannot yet attain. This section focuses on some of these systems’ features.

Fig. 15.5 Control loop for a web server process (Abdelzaher et al., 2002)

Figure 15.5 describes a QoS (Quality-of-Service) management solution for a Web application based on control theory (Abdelzaher, Shin, & Bhatti, 2002). Extending these concepts to an application in an IaaS Cloud, the control loop that tries to avoid system overloads and to meet the individual response-time and throughput guarantees would consist of the following components (a minimal sketch of such a loop follows the list):

• Service Application: the service application to be executed in one or more virtual machines in the Cloud.

• Monitor: it provides feedback about resource utilization, based on available measures such as CPU, disk, memory and network bandwidth.

• Controller: given the difference between the desired QoS and the resource utilization (as measured by the monitor component), the controller has to decide on the corrective actions needed to meet the QoS target. Control theory offers analytic techniques for closing the loop, such as modelling the system as a gain function that maximizes resource utilization within acceptable margins.

• Actuator: it translates abstract controller outputs into concrete actions taken by the Cloud middleware to scale the service components up or down and so change the load each of them handles.
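The following sketch illustrates such a closed loop with a simple proportional controller that sizes the VM pool from the gap between the measured and the target response time. It is an illustration only, not the controller of Abdelzaher et al.; the monitor and actuator are simulated stand-ins for calls to a Cloud provider’s monitoring and provisioning APIs, and the target, gain and limits are assumed values.

    # Illustrative proportional feedback loop in the spirit of Fig. 15.5.
    import random

    TARGET_RESPONSE_MS = 200.0   # response-time guarantee to be met (assumed)
    GAIN = 0.02                  # proportional gain of the controller (assumed)
    MIN_VMS, MAX_VMS = 1, 20     # actuator limits (assumed)

    def measure_response_time(vms: int) -> float:
        """Monitor (simulated): average response time shrinks as VMs are added."""
        return 800.0 / vms + random.uniform(-10, 10)

    def set_vm_count(n: int) -> None:
        """Actuator (simulated): would invoke the Cloud middleware to resize."""
        print(f"actuator: running {n} VM(s)")

    def control_step(vms: int) -> int:
        """Controller: one iteration of the closed loop."""
        error = measure_response_time(vms) - TARGET_RESPONSE_MS  # > 0 means too slow
        vms = int(min(MAX_VMS, max(MIN_VMS, round(vms + GAIN * error))))
        set_vm_count(vms)
        return vms

    vms = MIN_VMS
    for _ in range(10):          # in practice this loop would run periodically
        vms = control_step(vms)

Under these simulated dynamics the loop settles on the smallest number of VMs that keeps the response time near the target, which is exactly the behaviour the controller component is expected to provide.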

Here, we highlight an open challenge for dynamic scalability in the Cloud: finding a controller that can model a service deployed in a Cloud and make the system scale as expected. In Roşu, Schwan, Yalamanchili, and Jha (1997) the authors proposed the use of Adaptive Resource Allocation (ARA) in real time for embedded system platforms. ARA mechanisms promptly adjust resource allocation (vertical scalability) to runtime variations in the application’s needs whenever there is a risk of not meeting its timing constraints, avoiding the “over-sizing” of real-time systems for the worst-case scenario. ARA models an application as a set of interconnected software components whose execution is driven by event streams:

• Resource Usage Model: describes an application’s expected computational and communication needs and their runtime variations.

• Adaptation Model: acceptable configurations in terms of expected resource needs and application-specific configuration overheads.

These models could be generated by static and dynamic profiling tools that analyze the source code and the application runtime under different workloads, respectively. When the ARA controller detects a risk of not meeting performance targets, it calculates an acceptable configuration and a more appropriate resource allocation. Resource-needs estimations are based on node characteristics (processor speed factor, communication link speed, communication overhead) and on the application’s static (parallelism level, execution time, number of interchanged messages and processor speed factor) and dynamic (execution factor, intra-component message-exchange factors) resource usage models.

Similar real-time techniques based on mathematical application profiling and negative feedback loops, such as ARA, could be adapted to the Cloud scalability automation problem and represent future research challenges. Indeed, time-series prediction is a very complex process, highly specific to certain application domains (or even to individual applications).
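As one deliberately simple example of prediction-driven (proactive) scaling, the sketch below applies single exponential smoothing to recent request rates and triggers scale-out when the forecast exceeds the current capacity. The series, smoothing factor and threshold are made-up illustrations; realistic workloads usually demand far more elaborate, domain-specific models, which is precisely the research challenge noted above.

    # Illustrative proactive-scaling decision from a one-step load forecast.
    def exponential_smoothing(series, alpha=0.3):
        """One-step-ahead forecast of the next value in the series."""
        forecast = series[0]
        for observation in series[1:]:
            forecast = alpha * observation + (1 - alpha) * forecast
        return forecast

    requests_per_minute = [120, 130, 128, 150, 170, 165, 180]
    predicted = exponential_smoothing(requests_per_minute)
    if predicted > 150:       # hypothetical capacity of the current deployment
        print(f"forecast {predicted:.1f} req/min: scale out before the load arrives")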