Patent attributes
In one embodiment, a device obtains telemetry data for a plurality of encrypted traffic flows observed in a network. The device clusters the observed traffic flows into observed flow clusters, based on one or more flow-level features of the obtained telemetry data, and clusters malware-related traffic telemetry data into malware-related flow clusters. The observed and malware-related telemetry data are indicative of sequence of packet lengths and times (SPLT) information for the traffic flows. The device samples sets of flows from the observed and malware-related flow clusters, with each set including at least one flow from an observed flow cluster and at least one flow from a malware-related flow cluster. The device trains a deep learning neural network to determine whether a particular encrypted traffic flow is malware-related, by using the SPLT information for the sampled sets of traffic flows as input to an input layer of neurons of the deep learning neural network.
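The clustering, sampling, and input-encoding steps described above might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the flow records, the `port` field used as the flow-level feature, the normalization caps, and the helper names `splt_vector`, `cluster_by_feature`, and `sample_sets` are all hypothetical. Grouping by a discrete feature value stands in for whatever clustering technique an embodiment actually uses, and training of the neural network itself is omitted.

```python
import random

def splt_vector(packets, n=4, max_len=1500, max_gap_ms=1000):
    """Flatten the first n (length, inter-arrival ms) pairs into a
    fixed-size, normalized vector; short flows are zero-padded.
    The caps max_len and max_gap_ms are assumed values."""
    vec = []
    for i in range(n):
        if i < len(packets):
            length, gap = packets[i]
            vec += [min(length, max_len) / max_len,
                    min(gap, max_gap_ms) / max_gap_ms]
        else:
            vec += [0.0, 0.0]
    return vec

def cluster_by_feature(flows, key):
    """Group flows that share a flow-level feature value.
    Assumption: a stand-in for any real clustering algorithm."""
    clusters = {}
    for flow in flows:
        clusters.setdefault(key(flow), []).append(flow)
    return list(clusters.values())

def sample_sets(observed_clusters, malware_clusters, num_sets, rng):
    """Each sampled set holds one flow from an observed flow cluster
    and one flow from a malware-related flow cluster."""
    sets = []
    for _ in range(num_sets):
        obs = rng.choice(rng.choice(observed_clusters))
        mal = rng.choice(rng.choice(malware_clusters))
        sets.append((obs, mal))
    return sets

# Hypothetical telemetry: SPLT as (packet length in bytes, gap in ms).
observed = [
    {"id": "o1", "port": 443, "splt": [(517, 0), (1460, 12), (310, 3)]},
    {"id": "o2", "port": 443, "splt": [(200, 0), (1460, 40)]},
    {"id": "o3", "port": 8443, "splt": [(90, 0), (90, 500)]},
]
malware = [
    {"id": "m1", "port": 443, "splt": [(150, 0), (150, 1000), (150, 1000)]},
    {"id": "m2", "port": 9001, "splt": [(64, 0), (64, 2000)]},
]

observed_clusters = cluster_by_feature(observed, key=lambda f: f["port"])
malware_clusters = cluster_by_feature(malware, key=lambda f: f["port"])

rng = random.Random(0)  # fixed seed so the sketch is repeatable
training_sets = sample_sets(observed_clusters, malware_clusters, 5, rng)

# Assumption: the SPLT vectors of a set are concatenated into one
# vector that would feed the input layer of the network.
inputs = [splt_vector(o["splt"]) + splt_vector(m["splt"])
          for o, m in training_sets]
```

In this sketch every input vector has a fixed width (two values per packet, four packets per flow, two flows per set), which is one simple way to match a fixed-size input layer; an embodiment could equally feed the flows of a set through the network separately.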