A method of filtering an original set of user-provided text responses across a network, including: receiving, from one or more processors on one or more user computers, multiple user-provided text responses to a question, the multiple user-provided text responses forming the original set; identifying text responses that are no-value text responses and removing the no-value text responses from the original set; removing, from the original set, text responses whose length does not meet a threshold length; identifying text responses that are gibberish responses and removing the gibberish responses from the original set; sending the remaining responses as a filtered set of responses to a machine learning system, the machine learning system to perform clustering on the filtered set to identify one or more clusters of text responses that are similar to each other, identify text responses outside the one or more clusters as noise responses, score the noise responses, and remove text responses having a score equal to or below a threshold score from the filtered set of responses to produce a final set; and outputting the final set of responses to an information gatherer.
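The filtering stages recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the no-value list, the vowel-ratio gibberish test, the Jaccard-similarity grouping used as a stand-in for clustering, the noise score (a count of distinct words longer than three letters), and all threshold values are hypothetical choices made only for illustration. Network transport to and from the machine learning system is omitted.

```python
import re

# All constants below are assumptions; the text does not specify them.
NO_VALUE = {"n/a", "na", "none", "nothing", "no comment", "idk"}
MIN_LENGTH = 5        # assumed threshold length, in characters
SIMILARITY = 0.3      # assumed Jaccard similarity for "similar" responses
SCORE_THRESHOLD = 3   # assumed threshold score for noise responses


def is_gibberish(text):
    """Crude stand-in test: flag text whose vowel ratio is implausible."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return True
    vowel_ratio = sum(c in "aeiou" for c in letters) / len(letters)
    return not 0.2 <= vowel_ratio <= 0.7


def prefilter(responses):
    """Drop no-value, too-short, and gibberish responses, in that order."""
    kept = []
    for response in responses:
        text = response.strip()
        if text.lower().rstrip(".!") in NO_VALUE:
            continue
        if len(text) < MIN_LENGTH:
            continue
        if is_gibberish(text):
            continue
        kept.append(text)
    return kept


def words(text):
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_noise(filtered):
    """Return indices of responses with no sufficiently similar neighbor.

    A response with at least one neighbor is treated as a cluster member;
    this simple grouping stands in for a real clustering algorithm.
    """
    tokens = [words(r) for r in filtered]
    noise = []
    for i, ti in enumerate(tokens):
        if not any(j != i and jaccard(ti, tj) >= SIMILARITY
                   for j, tj in enumerate(tokens)):
            noise.append(i)
    return noise


def score(text):
    """Hypothetical noise score: distinct words longer than three letters."""
    return len({w for w in words(text) if len(w) > 3})


def final_set(filtered):
    """Keep clustered responses and noise responses scoring above threshold."""
    noise = set(find_noise(filtered))
    return [r for i, r in enumerate(filtered)
            if i not in noise or score(r) > SCORE_THRESHOLD]


responses = [
    "n/a",
    "ok",
    "asdfghjkl qwrtp",
    "The app crashes when I upload photos",
    "App crashes on photo upload",
    "Love it",
    "Please add offline maps for hiking trips abroad",
]
filtered = prefilter(responses)
result = final_set(filtered)
```

In this example the first three responses are removed by the pre-filters, the two crash reports form a cluster, and the two remaining responses are treated as noise. Of those, "Love it" scores at or below the assumed threshold and is dropped, while the offline-maps request scores above it and stays in the final set.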