Patent attributes
A system and method for data stream processing. Two or more instances are connected as a topology, wherein at least one of the instances is a spout and at least one of the instances is a bolt. The topology is submitted to a service scheduler, wherein the service scheduler receives resource offers from a cluster manager representing computing resources available on one or more cluster nodes and determines which resources to accept and which computations to run on the accepted computing resources. The topology is scheduled as one or more jobs, wherein each job includes two or more containers, including a first container and a second container, the first container including a topology master and the second container including a stream manager and one or more stream processing system (SPS) instances, wherein each SPS instance represents one of the instances in the topology.
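
The container layout described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names Instance, Container, validate_topology, schedule_as_job and the instances_per_container parameter are hypothetical, chosen only for illustration, and the sketch omits the resource-offer exchange with the cluster manager entirely. It shows only the structural constraints stated in the abstract: a topology needs at least one spout and one bolt, the first container holds the topology master, and each further container holds a stream manager plus one or more SPS instances.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """One node of the topology (hypothetical model)."""
    name: str
    kind: str  # "spout" or "bolt"


@dataclass
class Container:
    """One container of a job, listing the processes it hosts."""
    role: str
    processes: List[str] = field(default_factory=list)


def validate_topology(instances: List[Instance]) -> None:
    """Check the constraints stated in the abstract."""
    if len(instances) < 2:
        raise ValueError("a topology needs two or more instances")
    kinds = {inst.kind for inst in instances}
    if "spout" not in kinds or "bolt" not in kinds:
        raise ValueError("a topology needs at least one spout and one bolt")


def schedule_as_job(instances: List[Instance],
                    instances_per_container: int = 2) -> List[Container]:
    """Lay out a single job: a topology-master container first, then
    worker containers that each hold a stream manager and SPS instances.
    The fixed-size grouping is an assumption made for this sketch."""
    if instances_per_container < 1:
        raise ValueError("instances_per_container must be at least 1")
    validate_topology(instances)

    # First container: the topology master only.
    containers = [Container("topology-master", ["topology-master"])]

    # Remaining containers: one stream manager plus a group of SPS instances.
    for start in range(0, len(instances), instances_per_container):
        group = instances[start:start + instances_per_container]
        processes = ["stream-manager"] + [inst.name for inst in group]
        containers.append(Container("worker", processes))
    return containers


if __name__ == "__main__":
    topology = [
        Instance("spout-0", "spout"),
        Instance("bolt-0", "bolt"),
        Instance("bolt-1", "bolt"),
    ]
    for container in schedule_as_job(topology):
        print(container.role, container.processes)
```

In this sketch a three-instance topology yields one topology-master container and two worker containers, so the resulting job satisfies the requirement of two or more containers. How instances are actually grouped into containers, and how accepted resource offers map onto those containers, is not specified in the abstract and is left out here.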