Prophet: Toward Fast, Error-Tolerant Model-Based Throughput Prediction for Reactive Flows in DC Networks

Abstract As modern network applications (e.g., large-scale data analytics) become more distributed and capable of application-layer traffic adaptation, they demand greater network visibility to orchestrate their data flows. As a result, the ability to predict the available bandwidth for a set of flows has become a fundamental requirement of today's networking systems. While previous studies address the case of non-reactive flows, prediction for reactive flows, e.g., flows managed by TCP congestion control algorithms, remains an open problem. In this paper, we take the first step toward solving this problem in a data center network. To address both theoretical and practical challenges, we introduce a novel learning-based prediction system built on the network utility maximization (NUM) model, with two key techniques: fast factor learning (FFL) and efficient flow sampling. We further adopt techniques to overcome practical concerns such as scalability, convergence, and unknown system parameters. We propose a system, Prophet, that leverages Software Defined Networking (SDN) to realize the model. Evaluations demonstrate that our solution achieves high accuracy in a wide range of settings.
Authors
  • Jingxuan Zhang
  • Kai Gao
  • Richard Yang (Yale)
  • Jun Bi
Date Aug-2020
Venue IEEE/ACM Transactions on Networking, Early access