Cosmos is a large-scale data processing system used by thousands of people at Microsoft to process exabytes of data across clusters of more than 50,000 servers. It provides a SQL-like language and makes it easy for teams to share and join data. This scale of data, servers, and shared use imposes demanding scalability requirements. The Apollo scheduler was developed to maximize cluster utilization while minimizing latency for heterogeneous workloads at cloud scale. Later, JetScope was created to support lower-latency interactive queries through intermediate result streaming and gang scheduling, while maintaining fault tolerance.