Patent attributes
The technology includes methods, processes, and systems for virtualizing graphics processing unit (GPU) memory. Example embodiments include managing the amount of GPU memory used by one or more processes, such as Application Programming Interfaces (APIs), that directly or indirectly affect one or more other processes running on the same GPU. Managing and/or virtualizing the amount of GPU memory may ensure that an end user does not receive a GPU out-of-memory error because an API request is affected by the processing of other API requests. A virtual machine with access to a GPU may be organized with one or more job slots that specify the number of processes able to run concurrently on that virtual machine. A process may be configured on each virtual machine running a software program or API and may be used to schedule work based on GPU memory requirements.
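The job-slot and memory-aware scheduling idea described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `Job`, `VmScheduler`, `gpu_mem_mb`, `job_slots`, and `total_gpu_mem_mb` are hypothetical, each job is assumed to declare its GPU memory requirement up front, and a simple first-in, first-out queue stands in for whatever scheduling policy an actual embodiment uses.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    gpu_mem_mb: int  # assumption: the job declares its GPU memory need up front


@dataclass
class VmScheduler:
    """Hypothetical per-VM scheduler: admits a job only if a job slot is
    free and its declared memory fits in the remaining GPU memory."""
    total_gpu_mem_mb: int
    job_slots: int
    running: dict = field(default_factory=dict)
    pending: deque = field(default_factory=deque)

    def used_mem_mb(self) -> int:
        return sum(job.gpu_mem_mb for job in self.running.values())

    def _fits(self, job: Job) -> bool:
        slot_free = len(self.running) < self.job_slots
        mem_free = self.used_mem_mb() + job.gpu_mem_mb <= self.total_gpu_mem_mb
        return slot_free and mem_free

    def submit(self, job: Job) -> None:
        if job.gpu_mem_mb > self.total_gpu_mem_mb:
            raise ValueError(f"{job.name} can never fit on this GPU")
        self.pending.append(job)
        self._dispatch()

    def finish(self, name: str) -> None:
        del self.running[name]
        self._dispatch()

    def _dispatch(self) -> None:
        # FIFO policy (an assumption): start queued jobs in arrival order
        # for as long as the job at the head of the queue fits.
        while self.pending and self._fits(self.pending[0]):
            job = self.pending.popleft()
            self.running[job.name] = job


# Usage: a 16 GB GPU shared through two job slots.
sched = VmScheduler(total_gpu_mem_mb=16000, job_slots=2)
sched.submit(Job("a", 10000))  # starts immediately
sched.submit(Job("b", 8000))   # queued: 10000 + 8000 exceeds 16000
sched.finish("a")              # frees memory, so "b" is started
```

In this sketch a request that would exceed the remaining GPU memory waits in the queue instead of starting and failing with an out-of-memory error, which is the effect the abstract attributes to managing GPU memory across concurrent requests.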