Energy-Aware Speculative Scheduling
This thesis presents a comprehensive investigation into energy-efficient workflow scheduling in multi-cloud environments, specifically tailored for data-intensive applications. Executing data-intensive analysis workflows in a multi-cloud environment requires large amounts of input data, which are stored across multiple storage elements. The turnaround time of an individual analysis workflow running on a worker machine is mostly determined by the data reading time. These workflows frequently encounter bottlenecks due to task queuing and repeated accesses to similar types of input data, leading to inefficiencies in data retrieval. This inefficiency escalates operational costs for data centre operators and adversely impacts the environment by increasing carbon dioxide emissions.
Minimizing the data reading time and energy consumption can improve the overall efficiency of the data analysis process while reducing energy dissipation. To address this problem, I use Energy Aware Speculative Scheduling to optimize multi-cloud analysis workflows by intelligently streaming data to the worker machine before a task arrives for execution and by finding the sweet spot for CPU execution. I propose an Event System (ES), an in-memory process responsible for proactively providing input data to the workflow processes. It prefetches data from the storage elements into the memory of the worker machine that executes the workflow. Using locality-aware scheduling and prefetching algorithms, it performs Speculative Scheduling based on an evaluation of historic execution logs with a Bayesian inference model. The ES learns about incoming jobs ahead of time and uses intelligent data streaming to supply data to these jobs, thus reducing the overall scheduling and data access latencies and significantly improving the overall turnaround time. To optimize the energy efficiency of the whole analysis process, I present models for energy consumption that employ Variational Autoencoders to generate similar environments. I have evaluated the proposed system using a number of large analysis workflows from High Energy Physics.
The results show that by using the Speculative Aware Scheduling (SAS) technique with Bayesian inference for task prediction and proactive data prefetching, significant improvements (over 30%) can be achieved in the execution of data-intensive workflows in the cloud environment. Furthermore, in terms of Energy Aware modeling, the findings reveal that optimal operational configurations, such as a 15% reduction (the sweet spot) in operational configurations using the SAS technique, can decrease energy consumption by up to 10% with only a minimal increase in turnaround time (5%).
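As an illustration only, the idea of predicting the next input from historic execution logs and prefetching it into memory can be sketched as below. This is a minimal sketch under stated assumptions, not the thesis's implementation: the class name SpeculativePrefetcher, the first-order "previous dataset to next dataset" model, the symmetric Dirichlet prior alpha, the probability threshold, and the fetch callback are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


class SpeculativePrefetcher:
    """Toy stand-in for an event system (hypothetical, for illustration)."""

    def __init__(self, datasets, alpha=1.0):
        self.datasets = list(datasets)
        self.alpha = alpha  # assumed symmetric Dirichlet prior pseudo-count
        self.transitions = defaultdict(Counter)  # previous -> counts of next
        self.cache = {}  # in-memory store of prefetched data

    def observe(self, log):
        """Update transition counts from a historic access log."""
        for prev, nxt in zip(log, log[1:]):
            self.transitions[prev][nxt] += 1

    def posterior(self, prev):
        """Posterior predictive P(next | prev) under the Dirichlet prior."""
        counts = self.transitions[prev]
        total = sum(counts.values()) + self.alpha * len(self.datasets)
        return {d: (counts[d] + self.alpha) / total for d in self.datasets}

    def prefetch(self, prev, fetch, threshold=0.5):
        """Prefetch the most probable next dataset if it is likely enough."""
        probs = self.posterior(prev)
        best = max(probs, key=probs.get)
        if probs[best] >= threshold and best not in self.cache:
            self.cache[best] = fetch(best)  # fetch is a caller-supplied reader
        return best, probs[best]


es = SpeculativePrefetcher(["A", "B", "C"])
es.observe(["A", "B", "A", "B", "A", "C"])
best, prob = es.prefetch("A", fetch=lambda name: f"<bytes of {name}>")
print(best, prob)
```

In this toy log, dataset A was followed by B twice and by C once, so the sketch speculatively loads B before any task asks for it. A real system would also need cache eviction and a locality-aware choice of storage element, which are left out here.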
History
Supervisor(s)
Ashiq Anjum; Lu Liu; Latchezar Betev
Date of award
2025-01-28
Author affiliation
School of Computing and Mathematical Sciences
Awarding institution
University of Leicester
Qualification level
- Doctoral
Qualification name
- PhD