This document presents a load balancing model for overloaded cloud partitions, designed to improve efficiency and reduce response time in cloud computing environments. The proposed strategy uses two queues, one for priority jobs and one for non-priority jobs, to manage job allocation while a partition is overloaded, together with a dedicated scheduling algorithm. Performance analysis indicates that the model improves response time and fault tolerance compared with existing load balancing solutions.
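
To make the two-queue idea concrete, the following is a minimal sketch in Python of how jobs arriving at an overloaded partition might be held and dispatched. It is an illustration only, not the scheduling algorithm proposed here: the names `Job`, `OverloadedPartition`, `next_job`, and `drain`, the boolean `priority` flag, and the dispatch policy (strict priority first, first-in first-out within each queue) are all assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class Job:
    job_id: str
    # Hypothetical flag: how jobs are classified is an assumption of this sketch.
    priority: bool = False


@dataclass
class OverloadedPartition:
    """Holds jobs that arrive while the partition is overloaded."""

    priority_queue: Deque[Job] = field(default_factory=deque)
    non_priority_queue: Deque[Job] = field(default_factory=deque)

    def enqueue(self, job: Job) -> None:
        # Route the job to one of the two queues based on its flag.
        if job.priority:
            self.priority_queue.append(job)
        else:
            self.non_priority_queue.append(job)

    def next_job(self) -> Optional[Job]:
        # Assumed policy: serve the priority queue first, FIFO within each queue.
        if self.priority_queue:
            return self.priority_queue.popleft()
        if self.non_priority_queue:
            return self.non_priority_queue.popleft()
        return None


def drain(partition: OverloadedPartition) -> List[str]:
    """Return job ids in the order this sketch would dispatch them."""
    order: List[str] = []
    job = partition.next_job()
    while job is not None:
        order.append(job.job_id)
        job = partition.next_job()
    return order


if __name__ == "__main__":
    partition = OverloadedPartition()
    for job_id, is_priority in [("a", False), ("b", True), ("c", False), ("d", True)]:
        partition.enqueue(Job(job_id, is_priority))
    print(drain(partition))  # ['b', 'd', 'a', 'c']
```

Under a strict-priority policy such as this one, non-priority jobs can wait indefinitely if priority jobs keep arriving, so a practical scheduler would likely add aging or a service quota for the non-priority queue.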