Patent attributes
Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.