Patent attributes
A system configured to improve beamforming by using deep neural networks (DNNs). The system can use one trained DNN to focus on a first person speaking an utterance (e.g., a target user) and one or more additional trained DNNs to focus on noise source(s) (e.g., wireless loudspeaker(s), a second person speaking, other localized sources of noise, or the like). Each DNN may generate time-frequency mask data that indicates the individual frequency bands that correspond to the particular source detected by that DNN. Using this mask data, a beamformer can generate beamformed audio data that is specific to an individual source (e.g., the target user or a source of noise). The system may perform noise cancellation to isolate first beamformed audio data associated with the target user by removing second beamformed audio data associated with the noise source(s).
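The flow described above (mask data, then per-source beamforming, then noise cancellation) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names `masked_covariance`, `beamform`, and `cancel_noise` are hypothetical, the masks are taken as already produced by the DNNs, the beamformer is a simple principal-eigenvector filter built from a mask-weighted spatial covariance, and the noise cancellation is a per-frequency least-squares subtraction standing in for whatever canceller the system actually uses.

```python
import numpy as np


def masked_covariance(stft, mask):
    """Mask-weighted spatial covariance for each frequency bin.

    stft: complex array, shape (mics, freqs, frames)
    mask: real array with values in [0, 1], shape (freqs, frames)
    returns: complex array, shape (freqs, mics, mics)
    """
    weighted = stft * mask[np.newaxis, :, :]
    # cov[f, m, n] = sum over frames of mask * x_m * conj(x_n)
    cov = np.einsum("mft,nft->fmn", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-8)  # avoid dividing by zero
    return cov / norm[:, np.newaxis, np.newaxis]


def beamform(stft, mask):
    """Steer toward the source selected by `mask` (assumed beamformer design).

    Uses the principal eigenvector of the masked covariance as the filter.
    returns: complex array, shape (freqs, frames)
    """
    cov = masked_covariance(stft, mask)
    # eigh sorts eigenvalues in ascending order, so the last column is principal.
    _, vecs = np.linalg.eigh(cov)
    steer = vecs[:, :, -1]  # shape (freqs, mics)
    # Remove the arbitrary eigenvector phase, using microphone 0 as reference.
    steer = steer * np.exp(-1j * np.angle(steer[:, :1]))
    # y[f, t] = steer[f]^H x[f, t]
    return np.einsum("fm,mft->ft", steer.conj(), stft)


def cancel_noise(target_beam, noise_beams):
    """Subtract the part of the target beam predictable from the noise beams.

    A per-frequency least-squares fit is an assumed stand-in for an
    adaptive noise canceller.
    """
    out = target_beam.copy()
    for f in range(target_beam.shape[0]):
        refs = np.stack([nb[f] for nb in noise_beams], axis=1)  # (frames, n_noise)
        coeffs, *_ = np.linalg.lstsq(refs, target_beam[f], rcond=None)
        out[f] = target_beam[f] - refs @ coeffs
    return out
```

In use, one would call `beamform(stft, target_mask)` for the target user, call `beamform(stft, noise_mask)` once per noise source, and pass the results to `cancel_noise`. Note that a least-squares subtraction of this kind also removes any target speech that leaks into the noise beams, so it is only a rough illustration of the isolation step.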