A method comprises extracting first task data from a first data source corresponding to a first application and second task data from a second data source corresponding to a second application, and comparing the first task data to the second task data using one or more natural language processing techniques. In the method, one or more matching tasks between the first task data and the second task data are identified based at least in part on the comparing. Code of at least one of the first application and the second application is analyzed to determine whether the code of at least one of the first application and the second application implements the one or more matching tasks.
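
One possible reading of the comparing, identifying, and analyzing steps is sketched below. It is an illustration under stated assumptions, not the claimed implementation. It assumes each item of task data is a short text description, uses bag-of-words cosine similarity as a stand-in for the "one or more natural language processing techniques", and treats code as implementing a task when enough of the task's words appear among the code's identifiers. The names `match_tasks` and `code_implements` and the thresholds are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_tasks(first_tasks, second_tasks, threshold=0.5):
    """Return (first, second, score) triples whose similarity meets the threshold.

    The threshold of 0.5 is an assumed value, not one taken from the source.
    """
    matches = []
    for t1 in first_tasks:
        v1 = Counter(tokenize(t1))
        for t2 in second_tasks:
            score = cosine_similarity(v1, Counter(tokenize(t2)))
            if score >= threshold:
                matches.append((t1, t2, score))
    return matches


def code_implements(task, source_code, min_overlap=0.5):
    """Crude heuristic (an assumption): the share of task words found in the code.

    camelCase identifiers are split into words first; snake_case is split by
    the tokenizer because underscores are not token characters.
    """
    task_tokens = set(tokenize(task))
    if not task_tokens:
        return False
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", source_code)
    code_tokens = set(tokenize(spaced))
    return len(task_tokens & code_tokens) / len(task_tokens) >= min_overlap


if __name__ == "__main__":
    # Hypothetical task data extracted from two applications' data sources.
    first = ["Export invoice to PDF", "Reset user password"]
    second = ["Reset the user password", "Send weekly report"]
    code = "def resetUserPassword(user_id):\n    pass\n"

    for t1, t2, score in match_tasks(first, second):
        print(f"{t1!r} ~ {t2!r} (score {score:.2f}), "
              f"implemented: {code_implements(t1, code)}")
```

A production system would likely replace the word-overlap measures with stronger techniques, such as stemming, embeddings, or static analysis of the code, but the flow is the same: compare the two sets of task data, keep the pairs that match, then check the application code for each matched task.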