Computer Science and Engineering
Date of Award
Doctor of Philosophy (PhD)
Chair and Committee
Jonathan S Turner
Data centers are growing rapidly in size and have recently begun acquiring a new role as cloud hosting platforms, allowing outside developers to deploy their own applications at large scale. As a result, today's data centers are multi-tenant environments that host an increasingly diverse set of applications, many of which have very demanding networking requirements. This has prompted research into new data center architectures that offer increased capacity by using topologies that introduce multiple paths between servers. To achieve consistent performance in these networks, traffic must be effectively load balanced among the available paths. In addition, some form of system-wide traffic regulation is necessary to provide performance guarantees to tenants.
To address these issues, this thesis introduces several software-based mechanisms that were inspired by techniques used to regulate traffic in the interconnects of scalable Internet routers. In particular, we borrow two key concepts that serve as the basis for our approach. First, we investigate packet-level routing techniques that are similar to those used to balance load effectively in routers. This work is novel in the data center context because most existing approaches route traffic at the level of flows to prevent their packets from arriving out of order. We show that routing at the packet level allows for far more efficient use of the network's resources, and we provide a novel resequencing scheme to deal with out-of-order arrivals.
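To make the idea of resequencing concrete, the following is a minimal sketch of a receiver-side buffer that releases packets in sequence-number order. It is not the resequencing scheme developed in the thesis, whose details are not given here. The class name `Resequencer`, the per-flow sequence numbers starting at 0, and the absence of any timeout for lost packets are all assumptions made for illustration.

```python
class Resequencer:
    """Illustrative per-flow resequencing buffer (hypothetical, not the thesis's scheme)."""

    def __init__(self):
        self.next_seq = 0   # next sequence number that may be delivered
        self.pending = {}   # packets that arrived early: seq -> payload

    def receive(self, seq, payload):
        """Accept one packet and return the payloads that can now be delivered in order."""
        # Ignore packets that were already delivered or are already buffered.
        if seq < self.next_seq or seq in self.pending:
            return []
        self.pending[seq] = payload
        released = []
        # Drain the buffer for as long as the next expected packet is present.
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released


if __name__ == "__main__":
    r = Resequencer()
    print(r.receive(1, "b"))  # arrives early, held back
    print(r.receive(0, "a"))  # fills the gap, both are released
```

A practical scheme would also need to bound how long a packet waits for a missing predecessor. This sketch leaves that out.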
Second, we introduce distributed scheduling as a means to engineer traffic in data centers. In routers, distributed scheduling controls the rates between ports on different line cards, enabling traffic to move efficiently through the interconnect. We apply the same basic idea to schedule rates between servers in the data center. We show that scheduling can prevent congestion from occurring and can be used as a flexible mechanism to support network performance guarantees for tenants. In contrast to previous work, which relied on centralized controllers to schedule traffic, our approach is fully distributed, and we provide a novel distributed algorithm to control rates. In addition, we introduce an optimization problem called backlog scheduling to study scheduling strategies that facilitate more efficient application execution.
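As a rough illustration of scheduling rates between servers, the sketch below assigns sender-to-receiver rates in two local steps. It is not the distributed algorithm from the thesis. The function name `schedule_rates`, the backlog-proportional split, and the single uniform `link_capacity` are assumptions chosen for the example. Each receiver divides its inbound capacity among senders in proportion to their reported backlog. Each sender then scales its grants down if together they exceed its outbound capacity. Neither step needs a central controller.

```python
def schedule_rates(backlog, link_capacity):
    """Hypothetical two-step rate assignment.

    backlog[s][d] is the amount of data queued at server s for server d.
    Returns rates[s][d], the rate server s may use toward server d.
    """
    # Step 1 (done locally at each receiver d): total backlog destined to d.
    inbound_total = {}
    for s, dests in backlog.items():
        for d, b in dests.items():
            inbound_total[d] = inbound_total.get(d, 0) + b

    # Each receiver grants its inbound capacity in proportion to backlog.
    grants = {
        s: {d: link_capacity * b / inbound_total[d]
            for d, b in dests.items() if b > 0}
        for s, dests in backlog.items()
    }

    # Step 2 (done locally at each sender s): respect the outbound capacity.
    rates = {}
    for s, g in grants.items():
        total = sum(g.values())
        scale = min(1.0, link_capacity / total) if total > 0 else 0.0
        rates[s] = {d: r * scale for d, r in g.items()}
    return rates


if __name__ == "__main__":
    demo = {"A": {"C": 300}, "B": {"C": 100, "D": 100}}
    print(schedule_rates(demo, link_capacity=10.0))
```

Because receiver grants never sum to more than the inbound capacity and senders only scale grants down, no server link is oversubscribed in this sketch. Some capacity can be left unused, and a more careful scheme would redistribute it.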
Haitjema, Mart Albert, "Delivering Consistent Network Performance in Multi-tenant Data Centers" (2013). All Theses and Dissertations (ETDs). 1077.