Patent attributes
The present disclosure describes a system for dynamic transaction-persistent server load balancing. The disclosed system, implemented on a device, receives a client request associated with a new transaction. In response to receiving the client request, the system dynamically infers relative capacities of a plurality of servers coupled to the device in a network. In particular, the system maintains a set of variables corresponding to the servers, each variable indicating the number of outstanding requests transmitted from the device to a respective server. By comparing the current values of the variables, the system infers relative server capacities and transmission latencies between the device and the servers. The system then selects a server that has high capacity, or low transmission latency to the device, relative to one or more other servers, and transmits to the selected server an outstanding request corresponding to the client request of the new transaction.
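
The selection step described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the class and method names (OutstandingRequestBalancer, route, complete, end_transaction) are hypothetical, and the rule that the server with the fewest outstanding requests is treated as having the highest relative capacity or lowest latency is an assumption about how the comparison of the variables might be carried out.

```python
class OutstandingRequestBalancer:
    """Hypothetical sketch of transaction-persistent load balancing."""

    def __init__(self, servers):
        # One variable per server: requests sent but not yet answered.
        self.outstanding = {server: 0 for server in servers}
        # Transaction id -> server, so a transaction stays on one server.
        self.assignments = {}

    def route(self, transaction_id):
        """Return the server that should receive the next request."""
        server = self.assignments.get(transaction_id)
        if server is None:
            # New transaction: compare the counters and pick the smallest.
            # Assumption: fewer outstanding requests implies higher capacity
            # or lower latency. Ties go to the earliest-listed server.
            server = min(self.outstanding, key=self.outstanding.get)
            self.assignments[transaction_id] = server
        self.outstanding[server] += 1
        return server

    def complete(self, server):
        """Record that a response arrived from the given server."""
        if self.outstanding[server] > 0:
            self.outstanding[server] -= 1

    def end_transaction(self, transaction_id):
        """Forget the server assignment once the transaction finishes."""
        self.assignments.pop(transaction_id, None)
```

In this sketch a server that answers quickly drains its counter sooner and is therefore chosen more often for new transactions, which is one way relative capacity could be inferred without querying the servers directly.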