Patent attributes
A method for edge profiling within a data graph measures the impact that a particular type of edge has on the graph. A list of all edges contained in a single connected component of the graph is generated in order to search for bridges between connected subcomponents. The process is implemented as two MapReduce jobs on separate compute clusters. The first job, the edge profile job, is implemented as a Map-only job. The second job reads the output of the first and builds an in-memory data structure for each connected component within the data graph. After these structures are built, each is traversed to find bridges.
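
The two-job flow described above can be illustrated with a minimal single-process sketch in Python. It is not the claimed implementation: the record layout `(component_id, u, v, edge_type)`, the function names, and the edge type labels are hypothetical, and the traversal uses a standard low-link depth-first search (Tarjan's bridge-finding approach), which the abstract does not name. A bridge here means an edge whose removal disconnects its component.

```python
from collections import defaultdict


def edge_profile_map(records):
    """Stand-in for the Map-only job: emit each edge keyed by its component.

    The (component_id, u, v, edge_type) record layout is an assumption.
    """
    for component_id, u, v, edge_type in records:
        yield component_id, (u, v, edge_type)


def find_bridge_indices(edges):
    """Return indices of edges whose removal disconnects the graph.

    Iterative low-link DFS; edges are tracked by index so that parallel
    edges between the same two nodes are never reported as bridges.
    """
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    disc, low, bridges = {}, {}, []
    timer = 0
    for root in list(adj):
        if root in disc:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            node, parent_edge, neighbors = stack[-1]
            descended = False
            for nxt, idx in neighbors:
                if idx == parent_edge:
                    continue  # do not walk back along the tree edge
                if nxt in disc:
                    low[node] = min(low[node], disc[nxt])
                else:
                    disc[nxt] = low[nxt] = timer
                    timer += 1
                    stack.append((nxt, idx, iter(adj[nxt])))
                    descended = True
                    break
            if not descended:
                stack.pop()
                if stack:
                    parent = stack[-1][0]
                    low[parent] = min(low[parent], low[node])
                    if low[node] > disc[parent]:
                        bridges.append(parent_edge)
    return bridges


def profile_bridges(mapped):
    """Stand-in for the second job: group edges per component, then
    build each component in memory and traverse it to find bridges."""
    by_component = defaultdict(list)
    for component_id, edge in mapped:
        by_component[component_id].append(edge)

    result = {}
    for component_id, typed_edges in by_component.items():
        endpoints = [(u, v) for u, v, _ in typed_edges]
        result[component_id] = [
            typed_edges[i] for i in find_bridge_indices(endpoints)
        ]
    return result


# Hypothetical input: component 1 is a triangle plus one pendant edge.
records = [
    (1, "a", "b", "friend"),
    (1, "b", "c", "friend"),
    (1, "c", "a", "friend"),
    (1, "c", "d", "spouse"),
    (2, "x", "y", "friend"),
]
bridges = profile_bridges(edge_profile_map(records))
```

Grouping the reported bridges by edge type would then give one possible measure of how much each type of edge contributes to holding a component together.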