A processor-implemented method for sound characterization is described. In one implementation, a time-frequency transform is derived for each of a plurality of sound signals from one or more sources, the sound signals being detected by a plurality of sensing devices. One or more single-source constant-time analysis zones are detected based at least on a correlation between the time-frequency transform signals from a pair of the sensing devices. At least one direction of arrival is estimated for each source in the detected single-source analysis zones. A histogram of the estimated directions of arrival is created, and an estimate of the number of sound sources and their corresponding directions of arrival is generated based at least on the histogram.
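The steps summarized above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes only two sensing devices in a far-field, alias-free setting, treats a zone as a band of adjacent frequency bins within one time block, uses a normalized correlation of transform magnitudes as the single-source test, and estimates direction of arrival from the inter-sensor phase difference at the strongest bin of each zone. All function names, parameters, and thresholds (`stft`, `zone_correlation`, `estimate_sources`, `corr_thresh`, `peak_frac`, and so on) are hypothetical choices made for illustration.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Hann-windowed short-time Fourier transform, shape (freq_bins, frames)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T


def zone_correlation(Z1, Z2):
    """Normalized correlation of transform magnitudes over one analysis zone."""
    a, b = np.abs(Z1).ravel(), np.abs(Z2).ravel()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0


def estimate_sources(X1, X2, freqs_hz, spacing_m, zone_bins=8,
                     corr_thresh=0.95, n_hist_bins=36, peak_frac=0.2, c=343.0):
    """Return (source_count, directions_deg) for one time block.

    X1, X2: time-frequency transforms from a sensor pair, shape (freq, time).
    Assumes sensor 2 lags sensor 1 by spacing_m * cos(theta) / c (far field).
    """
    doas = []
    for k0 in range(0, X1.shape[0] - zone_bins + 1, zone_bins):
        Z1, Z2 = X1[k0:k0 + zone_bins], X2[k0:k0 + zone_bins]
        # Keep only zones where the pair is highly correlated (assumed
        # single-source criterion; threshold is an illustrative value).
        if zone_correlation(Z1, Z2) < corr_thresh:
            continue
        # Use the strongest frequency bin of the zone for the DOA estimate.
        k = int(np.argmax(np.sum(np.abs(Z1) ** 2, axis=1)))
        f = freqs_hz[k0 + k]
        if f <= 0:
            continue
        phase = np.angle(np.sum(Z1[k] * np.conj(Z2[k])))
        cos_theta = np.clip(phase * c / (2 * np.pi * f * spacing_m), -1.0, 1.0)
        doas.append(float(np.degrees(np.arccos(cos_theta))))

    # Histogram of per-zone DOA estimates over 0..180 degrees.
    counts, edges = np.histogram(doas, bins=n_hist_bins, range=(0.0, 180.0))
    if counts.max() == 0:
        return 0, []
    centers = 0.5 * (edges[:-1] + edges[1:])
    padded = np.concatenate(([0], counts, [0]))
    floor = max(1.0, peak_frac * counts.max())
    # A peak is a sufficiently tall bin that exceeds its left neighbor and is
    # at least as tall as its right neighbor.
    peaks = [i for i in range(len(counts))
             if counts[i] >= floor
             and padded[i + 1] > padded[i]
             and padded[i + 1] >= padded[i + 2]]
    return len(peaks), [float(centers[i]) for i in peaks]
```

In this sketch the number of histogram peaks serves as the source-count estimate and the peak bin centers serve as the direction estimates, so the angular resolution is limited by the bin width (5 degrees with the default `n_hist_bins=36`). A sensor pair on a line also cannot distinguish front from back, which is why the angles are restricted to 0 to 180 degrees.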