Persistent connections between multiple client devices and multiple backend service components are managed using a consistent hashing-based approach to route distribution. A load balancer distributes the connections across multiple gateway servers. Each connection is associated with a device having an identifier, which can be hashed using a selected hashing algorithm. The gateway servers are assigned values over a hashing range. When a connection is established for a device, the hash value of that device's identifier can be mapped to a corresponding gateway server. The primary gateway server, i.e., the gateway server that establishes the connection, can store information for the connection (e.g., the port or interface) and can send identifying information to the corresponding gateway server determined by the hash value. When a backend service wants to locate the connection, the service hashes the device identifier to determine the corresponding gateway server, which returns the identity of the primary gateway server hosting the connection.
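
The lookup flow described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: the class and function names (`GatewayRing`, `Gateway`, `Cluster`, `hash_id`), the use of SHA-256 truncated to a 32-bit hashing range, and the single ring position per gateway server are all assumptions made for illustration. The load balancer's choice of primary gateway server is passed in as an argument rather than modeled.

```python
import bisect
import hashlib


def hash_id(key: str) -> int:
    """Hash a string into a 32-bit range (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class GatewayRing:
    """Assigns each gateway server one value over the hashing range."""

    def __init__(self, gateway_names):
        self._points = sorted((hash_id(name), name) for name in gateway_names)
        self._keys = [point for point, _ in self._points]

    def lookup(self, device_id: str) -> str:
        """Return the gateway whose value is the first at or after the
        device hash, wrapping around to the start of the range."""
        idx = bisect.bisect_left(self._keys, hash_id(device_id))
        return self._points[idx % len(self._points)][1]


class Gateway:
    def __init__(self, name: str):
        self.name = name
        self.local_connections = {}  # device_id -> port (held as primary)
        self.directory = {}          # device_id -> name of primary gateway


class Cluster:
    def __init__(self, gateway_names):
        self.gateways = {name: Gateway(name) for name in gateway_names}
        self.ring = GatewayRing(gateway_names)

    def connect(self, device_id: str, primary_name: str, port: int) -> str:
        """Record a connection on the primary gateway and register the
        primary's identity with the gateway selected by the device hash."""
        self.gateways[primary_name].local_connections[device_id] = port
        directory_name = self.ring.lookup(device_id)
        self.gateways[directory_name].directory[device_id] = primary_name
        return directory_name

    def locate(self, device_id: str):
        """What a backend service does: hash the device identifier, ask the
        corresponding gateway, and get back the primary gateway (or None)."""
        directory_name = self.ring.lookup(device_id)
        return self.gateways[directory_name].directory.get(device_id)


cluster = Cluster(["gw-a", "gw-b", "gw-c"])
# Assume the load balancer happened to place this device on gw-b.
cluster.connect("device-42", primary_name="gw-b", port=5001)
primary = cluster.locate("device-42")
```

In this sketch the backend service never needs to query every gateway server: one hash and one directory lookup are enough to find the primary. A production design would also need to handle gateway servers joining or leaving the ring and stale directory entries, which are omitted here.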