||Recent advances in network virtualization and pro- grammability enable innovative service models such as Service Chaining (SC), where flows can be steered through a pre-defined sequence of service functions, each deployed at possibly different cloud locations. A key aspect ultimately dictating the performance and efficiency of a SC is how to instantiate it onto the physical in- frastructure. While existing algorithms based on SC Embedding (SCE) have been effective in addressing this problem when SCs are assumed to only consume computation and communication resources, these algorithms are not designed to handle the increasing data-intensive nature of next-generation services. To fill this gap, in this paper, we formulate the data-intensive SCE problem with the goal of minimizing storage, computation, and communication resource costs subject to resource capacity and service chaining constraints. Using a randomized rounding technique that exploits a novel data-aware linear programming decomposition procedure, we develop a multi-criteria algorithm that provably achieves approximation guarantees while violating the resource capacities in a bounded way. Evaluation results demonstrate the near-optimal performance achieved by the proposed algorithm and provide relevant insights on the interplay between storage, computation, and communication resources.