CLOSER 2014 Paper Accepted

CLOSER 2014

Our recent submission to CLOSER 2014 has been accepted for publication. In this paper we develop shadow computing for the could computing environment. We show that using shadow replication as a fault tolerance scheme for map-reduce applications can both maximize profit and reduce energy consumption.

As the demand for cloud computing continues to increase, cloud service providers face the daunting challenge to meet the negotiated SLA agreement, in terms of reliability and timely performance, while achieving cost-effectiveness. This challenge is increasingly compounded by the increasing likelihood of failure in large-scale clouds and the rising cost of energy consumption. This paper proposes Shadow Replication, a novel profit-maximization resiliency model, which seamlessly addresses failure at scale, while minimizing energy consumption. The basic tenet of the model is to associate a suite of shadow processes to execute concurrently with the main process, but initially at a much reduced execution speed, to overcome failures as they occur. Two computationally-feasible schemes are proposed to achieve shadow replication. A performance evaluation framework is developed to analyze these schemes and compare their performance to traditional replication-based fault tolerance methods, focusing on the inherent tradeoff between fault tolerance, the specified SLA and profit maximization. The results show Shadow Replication leads to significant energy reduction, and is better suited for compute-intensive execution models, where up to 30% more profit increase can be achieved.