Patent attributes
A technology is provided for call failure backoff in a computing service environment. An allowable call failure rate is defined for application programming interface (API) calls sent to one or more endpoints. Each endpoint may use a token bucket containing a plurality of tokens, wherein a single token is defined as being equal to one API call failure. A number of tokens in the token bucket are determined prior to executing an API call to the one or more endpoints. A health status of the one or more endpoints is identified according to the number of tokens in the token bucket. The API calls to the one or more endpoints having the determined number of tokens in the token bucket that are equal to zero or may be delayed for a predetermined backoff time period.