Patent attributes
A method implemented by a server enables sharing of GPU resources by multiple clients. The server receives a request from a first client for GPU services. The request includes a first block of GPU code of an application executing on the first client. A first task corresponding to the first block of GPU code is enqueued in a task queue. The task queue includes a second task that corresponds to a second block of GPU code of an application executing on a second client. The server schedules a time for executing the first task using a GPU device that is assigned to the first client, and dispatches the first task to a GPU worker process to execute the first task at the scheduled time using the GPU device. The GPU device is shared, either temporally or spatially, by the first and second clients for executing the first and second tasks.
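The flow described in the abstract (receive request, enqueue task, schedule a time, dispatch to a worker) can be illustrated with a minimal sketch of the temporal-sharing case. This is not the patented implementation. The names `Task`, `GpuServer`, `receive_request`, and `schedule_and_dispatch`, the `duration` estimate, and the first-come-first-served policy are all assumptions introduced for illustration. Dispatch to a GPU worker process is simulated by returning a plan of start times rather than running any GPU code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple


@dataclass
class Task:
    client_id: str
    gpu_code: str    # stand-in for the block of GPU code sent by the client
    duration: float  # hypothetical run-time estimate in seconds


class GpuServer:
    """Illustrative server that time-shares GPU devices among clients."""

    def __init__(self, assignments: Dict[str, str]) -> None:
        # Maps each client to the GPU device assigned to it (assumed static).
        self.assignments = assignments
        self.queue: Deque[Task] = deque()
        # Earliest time at which each device is free again.
        self.next_free: Dict[str, float] = {}

    def receive_request(self, client_id: str, gpu_code: str,
                        duration: float) -> Task:
        # Enqueue a task corresponding to the client's block of GPU code.
        task = Task(client_id, gpu_code, duration)
        self.queue.append(task)
        return task

    def schedule_and_dispatch(self, now: float = 0.0) -> List[Tuple[str, str, float]]:
        # Drain the queue in arrival order. Each task starts when its
        # assigned device becomes free, so clients that share a device
        # take turns on it (temporal sharing).
        plan = []
        while self.queue:
            task = self.queue.popleft()
            device = self.assignments[task.client_id]
            start = max(now, self.next_free.get(device, now))
            self.next_free[device] = start + task.duration
            # A real server would hand the task to a GPU worker process here.
            plan.append((task.client_id, device, start))
        return plan


# Both clients are assigned the same device, so it is shared between them.
server = GpuServer({"client-1": "gpu-0", "client-2": "gpu-0"})
server.receive_request("client-2", "kernel_b", 2.0)  # second task, already queued
server.receive_request("client-1", "kernel_a", 1.0)  # first task, newly enqueued
plan = server.schedule_and_dispatch()
print(plan)
```

In this sketch the second client's task runs first because it was queued earlier, and the first client's task is scheduled for the moment the shared device becomes free. Spatial sharing, where both tasks run concurrently on separate portions of the device, would need a different scheduling rule and is not modeled here.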