Patent attributes
Techniques described herein are directed to arbitrating between multiple potentially-responsive, automated-assistant-capable electronic devices to determine which should respond to a user's spoken utterance, and/or which should defer to other electronic device(s). In various implementations, a spoken utterance provided by a user may be detected at a microphone of a first electronic device. Sound(s) emitted by additional electronic device(s) may also be detected at the microphone. Each of the sound(s) may encode a timestamp corresponding to detection of the spoken utterance at a respective electronic device. Timestamp(s) may be extracted from the sound(s) and compared to a local timestamp corresponding to detection of the spoken utterance at the first electronic device. Based on the comparison, the first electronic device may either invoke an automated assistant locally or defer to one of the additional electronic devices.
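The comparison step may be illustrated with a minimal Python sketch. The abstract does not state which comparison outcome causes a device to respond, so the rule used here (the earliest detection responds, with ties broken by device identifier) is an assumption made only for illustration. The names `Detection` and `should_respond_locally` are hypothetical, and the sketch assumes the timestamps have already been extracted from the detected sound(s) and refer to a comparable clock.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Detection:
    """One device's record of when it detected the spoken utterance."""
    device_id: str      # hypothetical identifier, used only to break ties
    timestamp_ms: int   # detection time in milliseconds (assumed comparable across devices)


def should_respond_locally(local: Detection, remote: Iterable[Detection]) -> bool:
    """Return True if the local device should invoke the assistant.

    Assumed rule (not stated in the source): the device with the earliest
    detection timestamp responds; equal timestamps are resolved by the
    lexicographically smaller device_id so that exactly one device wins.
    """
    local_key = (local.timestamp_ms, local.device_id)
    # The local device responds only if it precedes every remote detection.
    return all(local_key < (r.timestamp_ms, r.device_id) for r in remote)
```

Under this assumed rule, every device that applies the same function to the same set of timestamps reaches a consistent decision: at most one device responds and the others defer. A device that hears no sound(s) from other devices receives an empty `remote` collection and responds locally.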