Patent attributes
A machine learning network is implemented by executing a computer program of instructions on a machine learning accelerator (MLA) comprising a plurality of interconnected storage elements (SEs) and processing elements (PEs). The instructions are partitioned into blocks, which are retrieved from off-chip memory. A block includes a set of deterministic instructions to be executed by on-chip storage elements and/or processing elements according to a static schedule. The block also includes the number of non-deterministic instructions that must be executed before the set of deterministic instructions in that block is executed. These non-deterministic instructions, which may be instructions for storage elements to retrieve data from off-chip memory, are contained in one or more prior blocks. Their execution is counted, for example through the use of tokens, and the set of deterministic instructions in the current block is not executed until the count reaches the number provided in the block.
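The token-based gating described above can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the patented hardware mechanism: the names `Block`, `required_tokens`, and `simulate` are hypothetical, each deterministic instruction is assumed to take one tick, each non-deterministic instruction is given an explicit latency (which real hardware would not know in advance), and tokens are assumed to be consumed when a block is released.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical model of one instruction block fetched from off-chip memory."""
    name: str
    # Number of non-deterministic instructions (issued by prior blocks)
    # that must have completed before this block's deterministic set runs.
    required_tokens: int
    # Statically scheduled instructions; assumed to take one tick each.
    deterministic: List[str]
    # (instruction, latency) pairs, e.g. off-chip loads. The latency is an
    # assumption of this simulation only.
    non_deterministic: List[Tuple[str, int]] = field(default_factory=list)


def simulate(blocks: List[Block]) -> List[Tuple[str, int]]:
    """Return (block name, tick at which its deterministic set started)."""
    tick = 0
    tokens = 0
    in_flight: List[int] = []  # completion ticks of issued non-deterministic instructions
    starts: List[Tuple[str, int]] = []

    for block in blocks:
        while True:
            # Each completed non-deterministic instruction yields one token.
            tokens += sum(1 for t in in_flight if t <= tick)
            in_flight = [t for t in in_flight if t > tick]
            if tokens >= block.required_tokens:
                break
            if not in_flight:
                raise RuntimeError(f"{block.name} waits on tokens that can never arrive")
            # Stall: advance time to the next completion.
            tick = min(in_flight)

        # Assumption: releasing the block consumes the tokens it waited for.
        tokens -= block.required_tokens
        starts.append((block.name, tick))

        # Issue this block's non-deterministic instructions for later blocks.
        for _instr, latency in block.non_deterministic:
            in_flight.append(tick + latency)

        # Run the deterministic set on its static schedule.
        tick += len(block.deterministic)

    return starts


if __name__ == "__main__":
    program = [
        Block("B0", 0, ["configure"], [("load_a", 5), ("load_b", 3)]),
        Block("B1", 2, ["matmul", "relu"], [("load_c", 4)]),
        Block("B2", 1, ["add"]),
    ]
    for name, start in simulate(program):
        print(name, "starts at tick", start)
```

In this sketch, block B1 stalls until both loads issued by B0 have completed, and B2 stalls until the load issued by B1 has completed, so the deterministic instructions never start before the data they depend on has been counted as present.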