Abstract
Systems and methods are described for providing serverless inference against a trained machine learning (ML) model. Rather than obtaining one or more dedicated devices to conduct inferences, users can create a task on a serverless system that, when invoked, passes input data to a trained ML model and provides a result. To satisfy varying user requirements for inference speed, the system includes a variety of hardware configurations. The system can efficiently allocate resources between different tasks by invoking a task on a particular hardware configuration, selected based on the current availability of that configuration to host an execution environment in which the task is implemented and the expected time to invoke the task on that configuration. The system can therefore efficiently allocate resources among inferences using a variety of different ML models.
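The selection logic described above could be sketched as follows. This is a hypothetical illustration, not the patent's actual implementation: the `HardwareConfig` type, its fields, and the `select_configuration` heuristic are all assumptions chosen to show how availability and expected invocation time might jointly drive the choice.

```python
from dataclasses import dataclass

@dataclass
class HardwareConfig:
    """Illustrative record for one hardware configuration (assumed fields)."""
    name: str
    available_environments: int     # environments currently free to host the task
    expected_invoke_seconds: float  # estimated time to invoke the task (e.g., startup + inference)

def select_configuration(configs, max_wait_seconds):
    """Pick the fastest configuration that is both available and meets the
    user's latency requirement; otherwise fall back to the most available one."""
    candidates = [
        c for c in configs
        if c.available_environments > 0
        and c.expected_invoke_seconds <= max_wait_seconds
    ]
    if candidates:
        return min(candidates, key=lambda c: c.expected_invoke_seconds)
    # No configuration satisfies the requirement; queue on the most available.
    return max(configs, key=lambda c: c.available_environments)

configs = [
    HardwareConfig("cpu-small", available_environments=4, expected_invoke_seconds=2.5),
    HardwareConfig("gpu-large", available_environments=0, expected_invoke_seconds=0.3),
    HardwareConfig("gpu-small", available_environments=1, expected_invoke_seconds=0.8),
]
print(select_configuration(configs, max_wait_seconds=1.0).name)  # gpu-small
```

Here the fastest configuration (`gpu-large`) is skipped because it has no environment currently available, showing how availability and speed trade off in the selection.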