Patent attributes
A system for reconstructing browser interaction data from session data having incomplete tracking data is described. The system comprises:

a data ingestion engine for ingesting data from a plurality of different data sources including an on-line user-interaction tracking source which provides the session data relating to different users' interaction with a website, some of the session data including tracking identifiers, and a non-interaction tracking source for providing non-session data relating to user activity other than session data;

a data store for storing the ingested data;

a data cleansing engine for cleansing the ingested data, the data cleansing engine comprising: a data re-evaluation engine for evaluating the non-session data and recovering user identifiers within the non-session data; and a path view building engine for linking together session data from different user interaction sessions to form linked session data using the tracking identifiers within the session data;

wherein the data re-evaluation engine is arranged to compare the recovered user identifiers from the non-session data with user identifiers associated with the session data, and to associate any unlinked session data not previously linked with the linked session data, with linked session data which has an association via the recovered user identifiers, and wherein the path view building engine is arranged to link together any unlinked session data with linked session data having an association via the recovered user identifiers.
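The linking flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout (`session_id`, `tracking_id`, `user_id`), the choice of e-mail addresses as the user identifier, the regular expression, and the function names `recover_user_ids` and `build_path_views` are all assumptions made for illustration. Sessions are first grouped by tracking identifier. Sessions lacking one are then attached to an existing group only when their user identifier was also recovered from the non-session data.

```python
import re
from collections import defaultdict

# Assumption: user identifiers are e-mail addresses embedded in free text.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def recover_user_ids(non_session_records):
    """Re-evaluation step: recover user identifiers from non-session data."""
    recovered = set()
    for record in non_session_records:
        recovered.update(match.lower() for match in EMAIL_RE.findall(record))
    return recovered


def build_path_views(sessions, recovered_ids):
    """Link sessions by tracking id, then attach unlinked sessions by user id."""
    paths = defaultdict(list)   # tracking_id -> list of session ids
    unlinked = []               # sessions that carry no tracking identifier
    for s in sessions:
        if s.get("tracking_id"):
            paths[s["tracking_id"]].append(s["session_id"])
        else:
            unlinked.append(s)

    # Index linked paths by user ids that the non-session data confirms.
    user_to_path = {}
    for s in sessions:
        uid = (s.get("user_id") or "").lower()
        if s.get("tracking_id") and uid in recovered_ids:
            user_to_path.setdefault(uid, s["tracking_id"])

    # Attach each unlinked session to a path sharing a recovered user id.
    still_unlinked = []
    for s in unlinked:
        uid = (s.get("user_id") or "").lower()
        path_key = user_to_path.get(uid)
        if path_key is not None:
            paths[path_key].append(s["session_id"])
        else:
            still_unlinked.append(s["session_id"])
    return dict(paths), still_unlinked


# Hypothetical sample data, for illustration only.
sessions = [
    {"session_id": "s1", "tracking_id": "t-1", "user_id": "ann@example.com"},
    {"session_id": "s2", "tracking_id": "t-1", "user_id": None},
    {"session_id": "s3", "tracking_id": None, "user_id": "Ann@Example.com"},
    {"session_id": "s4", "tracking_id": None, "user_id": "bob@example.com"},
]
non_session_records = ["Order 1001 placed by Ann@Example.com by phone."]

recovered = recover_user_ids(non_session_records)
paths, still_unlinked = build_path_views(sessions, recovered)
print(paths)
print(still_unlinked)
```

In this sample, session `s3` has no tracking identifier but shares a recovered user identifier with the path built from `t-1`, so it joins that path. Session `s4` has a user identifier that the non-session data does not confirm, so it stays unlinked.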