Methods and systems for detecting and responding to an intrusion in a computer network include generating an adversarial training data set that includes original samples and adversarial samples, by perturbing one or more of the original samples with an integrated gradient attack to generate the adversarial samples. The original and adversarial samples are encoded to generate respective original and adversarial graph representations, based on node neighborhood aggregation. A graph-based neural network is trained to detect anomalous activity in a computer network, using the adversarial training data set. A security action is performed responsive to the detected anomalous activity.