Unicorn: Unified Resource Orchestration for Multi-Domain, Geo-Distributed Data Analytics

Abstract As the data volume increases exponentially over time, data-intensive analytics benefits substantially from multi organizational, geographically-distributed, collaborative computing, where different organizations contribute various yet scarce resources, e.g., computation, storage and networking resources, to collaboratively collect, share and analyze extremely large amounts of data. By analyzing the data analytics trace from the Compact Muon Solenoid (CMS) experiment, one of the largest scientific experiments in the world, and systematically examining the design of existing resource management systems for clusters, we show that the multi-domain, geo-distributed, resource-disaggregated nature of this new paradigm calls for a framework to manage a large set of distributively-owned, heterogeneous resources, with the objective of efficient resource utilization, following the autonomy and privacy of different domains, and that the fundamental challenge for designing such a framework is: how to accurately discover and represent resource availability of a large set of distributively-owned, heterogeneous resources across different domains with minimal information exposure from each domain? Existing resource management systems are designed for single domain clusters and cannot address this challenge. In this paper, we design Unicorn, the first unified resource orchestration framework for multi-domain, geo-distributed data analytics. In Unicorn, we encode the resource availability for each domain into resource state abstraction, a variant of the network view abstraction extended to accurately represent the availability of multiple resources with minimal information exposure using a set of linear inequalities. We then design a novel, efficient cross-domain query algorithm to discover and integrate the accurate, minimal resource availability information for a set of data analytics jobs across different domains. In addition, Unicorn also contains a global resource orchestrator that computes optimal resource allocation decisions for data analytics jobs. We discuss the implementation of Unicorn and present preliminary evaluation results to demonstrate the efficiency and efficacy of Unicorn. We will also give a full demonstration of the Unicorn system at SuperComputing 2017.
  • Qiao Xiang (Yale)
  • X. Tony Wang (Yale)
  • Jingxuan Zhang
  • Harvey Newman
  • Richard Yang (Yale)
  • Y. Jace Liu
Date Nov-2017
Venue Innovating the Network for Data Intensive Science