The automated collection of online data is enhanced by generating and saving a context that links a document to a related named entity, together with a credibility level for the online source. The context, the credibility level, and the quality and quantity of the collected data are used to improve automated decision-making based on that data. Both the quality and the quantity may be continuously updated and refined through machine learning. Three new algorithms (DUPES, CORRAL, and ONTO) are introduced to support these functions. They improve on current state-of-the-art engineering practice by sharpening the strategy for named-entity searching, by ensuring that topic modeling produces relevant topic tags, and by handling sentiment, which may be NEGATIVE, POSITIVE, or NEUTRAL (where NEUTRAL includes MISSING and INCONCLUSIVE).
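
As a rough illustration of the stored data described above, the following Python sketch shows one possible record that pairs a document with a named entity, a context, a source credibility level, and a three-valued sentiment in which MISSING and INCONCLUSIVE are folded into NEUTRAL. All names here (`Sentiment`, `normalize_sentiment`, `CollectedRecord`, and its fields) are hypothetical, the 0.0 to 1.0 credibility scale is an assumption, and nothing in this sketch represents the DUPES, CORRAL, or ONTO algorithms themselves.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Sentiment(Enum):
    NEGATIVE = "NEGATIVE"
    POSITIVE = "POSITIVE"
    NEUTRAL = "NEUTRAL"


# Raw labels that are treated as NEUTRAL, per the description above.
_NEUTRAL_ALIASES = {"NEUTRAL", "MISSING", "INCONCLUSIVE"}


def normalize_sentiment(raw: Optional[str]) -> Sentiment:
    """Map a raw label (or None) onto the three sentiment classes."""
    if raw is None:
        # Assumption: an absent label is handled like MISSING.
        return Sentiment.NEUTRAL
    label = raw.strip().upper()
    if label == "" or label in _NEUTRAL_ALIASES:
        return Sentiment.NEUTRAL
    # Raises KeyError for any label outside the three classes.
    return Sentiment[label]


@dataclass(frozen=True)
class CollectedRecord:
    """Hypothetical stored record for one document/entity pair."""
    document_id: str
    named_entity: str
    context: str               # how the document relates to the entity
    source_credibility: float  # assumed scale: 0.0 (low) to 1.0 (high)
    sentiment: Sentiment


record = CollectedRecord(
    document_id="doc-001",
    named_entity="Example Corp",
    context="supplier mentioned in a news article",
    source_credibility=0.8,
    sentiment=normalize_sentiment("inconclusive"),
)
```

Collapsing the raw labels at ingestion time keeps downstream decision logic limited to three cases, although a real system might also retain the raw label for auditing.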