Unicorn: Unified Resource Orchestration for Multi-Domain, Geo-Distributed Data Analytics

Abstract Data-intensive analytics is entering the era of multi-organizational, geographically distributed, collaborative computing, where different organizations contribute various resources, e.g., sensing, computation, storage and networking resources, to collaboratively collect, share and analyze extremely large amounts of data. This new paradigm calls for a framework to manage a large set of distributively owned heterogeneous resources, with the fundamental objective of efficient resource utilization while respecting the autonomy and privacy of resource owners. In this paper, we design Unicorn, the first unified framework that accomplishes this goal. The foundation of Unicorn is RSDP, an autonomous, privacy-preserving resource discovery and representation system that provides accurate resource availability information. Its core is a novel abstraction, the resource vector abstraction, which describes resource availability as a set of linear constraints. Unicorn also provides a series of advanced solutions to support automatic, efficient management of resource dynamics on both the supply and demand sides, including an automatic workflow transformer, an intelligent resource demand estimator and an efficient, scalable multi-resource orchestrator. Being the first unified framework for this new paradigm, Unicorn plays a fundamental role in next-generation data-intensive collaborative computing systems.
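To make the idea of describing resource availability as a set of linear constraints more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a purely hypothetical scenario of two data transfers that share a 100 Gbps link, with the first transfer additionally capped at 40 Gbps; the names `LinearConstraint` and `feasible` and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LinearConstraint:
    """One constraint of the form sum(coeffs[i] * x[i]) <= bound (hypothetical type)."""
    coeffs: Tuple[float, ...]  # one coefficient per requested resource amount
    bound: float               # capacity the weighted sum must not exceed

    def satisfied_by(self, x: Sequence[float], tol: float = 1e-9) -> bool:
        # Evaluate the left-hand side and compare it with the bound.
        return sum(a * v for a, v in zip(self.coeffs, x)) <= self.bound + tol


def feasible(constraints: Sequence[LinearConstraint], x: Sequence[float]) -> bool:
    """Return True if the requested allocation x satisfies every constraint."""
    return all(c.satisfied_by(x) for c in constraints)


# Assumed example: x[0] and x[1] are the bandwidths (Gbps) of two transfers.
constraints = [
    LinearConstraint((1.0, 1.0), 100.0),  # both transfers share a 100 Gbps link
    LinearConstraint((1.0, 0.0), 40.0),   # transfer 0 is also capped at 40 Gbps
]

print(feasible(constraints, (40.0, 60.0)))  # within both limits
print(feasible(constraints, (50.0, 50.0)))  # violates the 40 Gbps cap
```

In a sketch like this, a resource owner would only need to publish the constraint set, which is one plausible reading of how such a representation could fit the abstract's privacy goal.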
Authors
  • Qiao Xiang (Yale)
  • Shenshen Chen
  • Kai Gao
  • Harvey Newman
  • Ian Taylor (Cardiff)
  • Jingxuan Zhang
  • Richard Yang (Yale)
Date Aug-2017
Venue International Workshop on Distributed Analytics Infrastructure and Algorithms for Multi-Organization Federations