US Patent 11645515 Automatically determining poisonous attacks on neural networks

Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and segment those activations by the label of corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes executing a set of analyses and integrating the results of the analyses into a determination as to whether a training data set is poisonous based on determining if resultant activation clusters are poisoned.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11645515 Automatically determining poisonous attacks on neural networks

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11645515 Automatically determining poisonous attacks on neural networks