Data gathered from dense sensor networks is often highly correlated across collocated sensors. For example, in video surveillance networks, multiple cameras can observe the same object from different angles. Despite the spatial and temporal dependencies between video frames from different cameras, the deep learning algorithms used in today's video analytics pipelines treat all frames as independent inputs to image classifiers and object detectors. The outputs of these classifiers and detectors on multiple frames are then fused to extract information about the underlying sensed region. We present a cooperative learning framework that allows sensors to train deep learning systems on their own local data together with compressed insights from neighboring sensors' input data. This system fuses sensor data before classification, allowing learning agents to handle correlated inputs more naturally and to cooperate with neighboring sensors at minimal communication cost.
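The pre-classification fusion described above can be sketched as follows. This is an illustrative toy example, not the paper's implementation: the feature extractor, the compression step, and all names and shapes are hypothetical stand-ins chosen to show the data flow — a neighbor shares only a low-dimensional summary, which is concatenated with local features before classification.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(frame):
    # Hypothetical stand-in for a deep feature encoder: one value per column.
    return frame.mean(axis=0)

def compress(features, k=4):
    # Neighbors transmit only a truncated summary to limit communication cost.
    # (Truncation is an illustrative choice, not the paper's method.)
    return features[:k]

# Two collocated cameras observing the same scene (toy 8x8 "frames");
# the second view is a perturbed copy, so the inputs are correlated.
frame_a = rng.normal(size=(8, 8))
frame_b = frame_a + 0.1 * rng.normal(size=(8, 8))

local = extract_features(frame_a)                # full local features, shape (8,)
neighbor = compress(extract_features(frame_b))   # compressed insight, shape (4,)

# Early fusion: concatenate before classification, so a downstream
# classifier sees the correlated inputs jointly rather than fusing
# independent per-frame predictions afterward.
fused_input = np.concatenate([local, neighbor])  # shape (12,)
```

By contrast, the late-fusion baseline the abstract critiques would run a classifier on each frame separately and merge only the outputs.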