Patent attributes
One or more computer processors improve action recognition by removing inference introduced by visual appearances of objects within a received video segment. The one or more computer processors extract appearance information and structure information from a received video segment. The one or more computer processors calculate a factual inference (TE) for the received video segment utilizing the extracted appearance information and structure information. The one or more computer processors calculate a counterfactual debiasing inference (NDE) for the received video segment. The one or more computer processors calculate a total indirect effect (TIE) by subtracting the calculated counterfactual debiased inference from the calculated factual inference. The one or more computer processors action recognize the received video segment by selecting a classification result associated with a highest calculated TIE.