A framework for proactive fault tolerance in cloud-IoT applications
Integrating Internet of Things (IoT) devices with the cloud has several benefits, including expanding local IoT resources and improving cloud-IoT application performance. Cloud computing benefits from IoT devices and applications by extending its scope to real-world surroundings. In turn, IoT can draw on the cloud’s virtually unlimited computing and storage capacity. Modern cloud-based applications, including smart cities, home automation, and eHealth, require a highly scalable and available framework for computing, storage, and data analysis. However, because of its remote location, the cloud alone cannot keep pace with the growing number of IoT devices, and cloud providers struggle to meet quality of service (QoS) requirements such as low latency. Cloud applications also have a high probability of failure because they operate in large-scale environments comprising physical and virtual machines. The Coronavirus pandemic (COVID-19) tested cloud providers in many ways, none of which could have been predicted. Although the public cloud proved remarkably resilient under this unprecedented stress test, notable cloud failures did occur in the first half of 2020. The main objective of this thesis is to design and implement a cloud-IoT framework that uses proactive fault tolerance techniques to provide high reliability and availability for IoT applications. The framework aims to decrease the number of task failures and to minimize the time and cost of using the cloud. The thesis also analyzes and characterizes the behaviour of failed and finished tasks using publicly accessible traces. A design for highly reliable and available IoT applications is proposed, based on an Edge-Cloud architecture, to support modern IoT applications. The evaluation results show a significant correlation between unsuccessful tasks and the resources requested.
The results indicate that the proposed framework improves performance, with throughput efficiency increasing by 55% after the local resources are integrated with the cloud. The failure prediction model, based on machine learning and deep learning, can reduce the number of failed tasks for cloud-IoT applications. Moreover, the model predicts failed tasks with high precision, recall, and F1-score.
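For readers unfamiliar with the three metrics named above, the following minimal sketch shows how precision, recall, and F1-score are conventionally computed for a binary failure predictor. It is an illustration only and not the thesis's implementation: the function name `prediction_metrics`, the label encoding (1 for a failed task, 0 for a finished task), and the sample labels are all assumptions made up for this example.

```python
def prediction_metrics(true_labels, predicted_labels, positive=1):
    """Return (precision, recall, f1) for the positive class ("task failed").

    Hypothetical helper for illustration; the label encoding is an assumption.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: failed tasks that were predicted to fail.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: finished tasks that were predicted to fail.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: failed tasks that were predicted to finish.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Made-up labels for illustration only: 1 = failed task, 0 = finished task.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = prediction_metrics(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a proactive fault tolerance setting, high recall means few failing tasks go undetected, while high precision means few healthy tasks are flagged unnecessarily; F1 summarizes the balance between the two.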