A system includes one or more memory devices storing instructions and one or more processors configured to execute the instructions to perform steps of a method for processing a large file. The system may receive record data comprising a plurality of records, each having an identification value in a common field, the common field having a data format. The system may determine a plurality of focus values based on the data format and create a plurality of virtual processing units based on the plurality of focus values. Each of the plurality of virtual processing units may process a sub-group of the plurality of records that corresponds to the focus value associated with the respective virtual processing unit.
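
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes, purely for illustration, that a focus value is one possible final character of the identification value (ten values for a numeric format, sixteen for a hexadecimal one), that a virtual processing unit can be modeled as a worker thread, and that "processing" a sub-group means counting its records. The names `focus_values_for`, `process_subgroup`, and `process_records` are hypothetical and do not come from the source.

```python
from concurrent.futures import ThreadPoolExecutor


def focus_values_for(data_format):
    """Derive focus values from the data format of the common ID field.

    Assumption: one focus value per possible final character of the ID.
    """
    if data_format == "numeric":
        return list("0123456789")
    if data_format == "hex":
        return list("0123456789abcdef")
    raise ValueError(f"unsupported data format: {data_format!r}")


def process_subgroup(focus_value, records):
    """Placeholder per-unit work: report how many records the unit handled."""
    return focus_value, len(records)


def process_records(records, id_field, data_format):
    """Partition records by focus value and process each sub-group in its own unit."""
    focus_values = focus_values_for(data_format)

    # One sub-group per focus value, including focus values with no records.
    groups = {fv: [] for fv in focus_values}
    for record in records:
        key = str(record[id_field])[-1].lower()
        if key not in groups:
            raise ValueError(f"ID {record[id_field]!r} does not match format {data_format!r}")
        groups[key].append(record)

    # One worker thread per focus value stands in for a virtual processing unit.
    with ThreadPoolExecutor(max_workers=len(focus_values)) as pool:
        futures = [pool.submit(process_subgroup, fv, groups[fv]) for fv in focus_values]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    sample = [{"id": "1001"}, {"id": "2001"}, {"id": "3002"}]
    print(process_records(sample, "id", "numeric"))
```

Because the sub-groups are disjoint, the units in this sketch share no records and need no locking between them. A real system would likely stream the large file rather than hold every record in memory, and could use processes or separate machines in place of threads.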