||Many emerging applications generate large amounts of data at local processing nodes over time. Such data can include the status of each individual machine in a distributed comput- ing system, measurements from sensors, etc. Due to resource constraints in the distributed system, it is often impossible to transmit all the data to a central location for processing. In ad- dition, it is often useful to forecast the future based on histor- ical observations for planning purposes, but it is usually too resource consuming to have one forecasting model running at each local node, which may also neglect spatial correlations across the nodes. In this paper, we study the problem of col- lecting and forecasting distributed time-series data, such as resource consumption levels and sensor measurements, in a scalable manner. The forecasting model is built using only a summary of the local observations. We propose an algorithm that allows each local node to decide when to transmit its most recent observation to the central node, so that the trans- mission frequency remains within a given constraint value. Based on the most recent observations received from local nodes, the central node summarizes the received data into a small number of clusters. Because the cluster partitioning can change over time, we also develop a method to capture the evolution of clusters and their centroids. As an effective way to reduce the amount of computation, time-series forecast- ing models are trained on the time-varying centroids of each cluster, to forecast the future observations of a group of local nodes. The effectiveness of our proposed approach is con- firmed by extensive experiments using real-world datasets.