One embodiment provides a method that includes identifying all computing nodes in a data center and the connections associated with those computing nodes. For each computing node, running processes are identified using natural language processing (NLP) by: extracting known process entities according to predetermined rules; and extracting unknown process entities by grouping process logs that share process entities and identifying hints in parameters and directory paths, receiving annotations to the hints to identify the application that a process is running, and creating a new rule based on the annotations and propagating the new rule to other process logs. A visual representation of the computing nodes and the processes running on the computing nodes is generated.
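The process-identification steps summarized above can be illustrated with a minimal sketch. It is not the embodiment's implementation: plain regular expressions and token splitting stand in for the NLP component, and every name in it (`KNOWN_RULES`, `GENERIC_DIRS`, `classify`, `extract_hints`, `group_unknown`, `add_rule_from_annotation`, the sample process logs, and the "Acme Daemon" application) is hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical "predetermined rules": (pattern, application name) pairs.
KNOWN_RULES = [
    (re.compile(r"\bmysqld\b"), "MySQL"),
]

# Directory names assumed too generic to serve as hints.
GENERIC_DIRS = {"", "usr", "bin", "sbin", "opt", "etc", "var", "lib", "local"}

# Made-up process logs (command lines) for one computing node.
SAMPLE_LOGS = [
    "/usr/bin/mysqld --datadir=/var/lib/mysql",
    "/opt/acme/bin/acmed --config=/etc/acme/acme.conf",
    "/opt/acme/bin/acme-worker --queue=jobs",
]


def classify(log, rules):
    """Return the application named by the first matching rule, else None."""
    for pattern, app in rules:
        if pattern.search(log):
            return app
    return None


def extract_hints(log):
    """Collect parameter names and directory-path components as hints."""
    hints = set()
    for token in log.split():
        name, sep, value = token.partition("=")
        if name.startswith("-"):
            hints.add(name.lstrip("-"))  # parameter name, e.g. "config"
        candidate = value if sep else name
        if "/" in candidate:
            hints.update(p for p in candidate.split("/") if p not in GENERIC_DIRS)
    return hints


def group_unknown(logs, rules):
    """Group logs that no rule recognizes by the hints they share."""
    groups = defaultdict(list)
    for log in logs:
        if classify(log, rules) is None:
            for hint in extract_hints(log):
                groups[hint].append(log)
    return groups


def add_rule_from_annotation(rules, hint, app):
    """Turn an annotated hint into a new rule; returns a new rule list."""
    pattern = re.compile(r"(?<![\w.-])" + re.escape(hint) + r"(?![\w.-])")
    return rules + [(pattern, app)]


if __name__ == "__main__":
    groups = group_unknown(SAMPLE_LOGS, KNOWN_RULES)
    # An annotator labels the shared hint "acme"; the resulting rule is then
    # propagated by re-classifying every process log.
    rules = add_rule_from_annotation(KNOWN_RULES, "acme", "Acme Daemon")
    for log in SAMPLE_LOGS:
        print(classify(log, rules), "<-", log)
```

In this sketch, propagation is simply re-running `classify` over all logs with the extended rule list, so a single annotation on the shared hint labels every log in that group. The resulting node-to-application mapping is the kind of data a visual representation could be built from.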