||Deep learning models typically make inferences over transient features of the latent space, i.e., they learn data representations to make decisions based on the current state of the inputs over short periods of time. Such models would struggle with state-based events, or complex events, that are composed of simple events with complex spatial and temporal dependencies. In this paper, we propose DeepCEP, a framework that integrates the concepts of deep learning models with complex event processing engines to make inferences across distributed, multimodal information streams with complex spatial and temporal dependencies. DeepCEP utilizes deep learning to detect primitive events. A user can define a complex event to be detected as a particular sequence or pattern of primitive events as well as any other logical predicates that constrain the definition of such an event. The integration of human logic not only increases robustness and interpretability, but also greatly reduces the amount of training data required. Further, we demonstrate how the uncertainty of a model can be propagated throughout the complex event detection pipeline. Finally, we enumerate the future directions of research enabled by DeepCEP. In particular, we detail how an end-to-end training model for complex event processing with deep learning may be realized.