Patent attributes
Described herein is an operator-based approach to representing dataflows. A dataflow is a set of one or more operations and one or more flows of data that are processed successively by the set of operations. A dataflow is described by a generic description in which the operations of the dataflow are represented by operators. An operator defines a primitive operation (e.g., join, filter), specifying not only the type of operation but also the inputs and outputs, rules, and criteria that govern the operation. From the generic description, a code implementation is generated that may be executed entirely on a source database system and a target data warehouse, without need for an intermediate system, such as a data movement engine, to participate in the execution of the code implementation.
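As an illustration only, the operator-based description can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described herein: the class names (`Operator`, `Source`, `Filter`, `Join`), the function `generate_sql`, the table and column names, and the choice of SQL as the generated code are all hypothetical. Each operator records its type, its inputs, and the criteria governing it, and the generated statement is intended to run inside the database systems themselves rather than in a separate data movement engine.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Operator(ABC):
    """Hypothetical base type: one primitive operation in a dataflow."""

    @abstractmethod
    def to_sql(self) -> str:
        """Return a SELECT statement producing this operator's output."""


@dataclass
class Source(Operator):
    table: str  # name of a table on the source database (assumed)

    def to_sql(self) -> str:
        return f"SELECT * FROM {self.table}"


@dataclass
class Filter(Operator):
    input: Operator   # upstream operator whose rows are filtered
    condition: str    # criterion governing the filter

    def to_sql(self) -> str:
        return f"SELECT * FROM ({self.input.to_sql()}) AS f WHERE {self.condition}"


@dataclass
class Join(Operator):
    left: Operator
    right: Operator
    on: str  # join criterion, written against the aliases l and r

    def to_sql(self) -> str:
        return (
            f"SELECT * FROM ({self.left.to_sql()}) AS l "
            f"JOIN ({self.right.to_sql()}) AS r ON {self.on}"
        )


def generate_sql(target_table: str, flow: Operator) -> str:
    """Generate one statement that loads the dataflow's output into the target."""
    return f"INSERT INTO {target_table} {flow.to_sql()}"


# Generic description of a dataflow: filter orders, then join with customers.
flow = Join(
    left=Filter(Source("orders"), "amount > 100"),
    right=Source("customers"),
    on="l.customer_id = r.id",
)
sql = generate_sql("warehouse.big_orders", flow)
print(sql)
```

In this sketch the generic description is the tree of operator objects, and the code implementation is the single SQL string derived from it. A real generator would also need to handle dialect differences, column lists, and the transfer of data between the source and target systems, all of which are omitted here.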