The research proposes NoiseAttack, a novel backdoor attack for image classification that utilizes the power spectral density (PSD) of White Gaussian Noise (WGN) to achieve imperceptibility and robustness.
The WGN trigger is embedded globally across all training samples but is designed to activate only on specific victim samples. NoiseAttack achieves sample-specific, multi-targeted misclassification, allowing the attacker to steer the model's predictions towards multiple target classes.
The attack embeds a concealed backdoor in deep neural networks by poisoning the training data with WGN as the trigger, allowing for flexible target labels.
By applying WGN with specific standard deviations to input samples, the attacker can cause the model to misclassify the victim class into the desired target classes. The trigger is designed to be imperceptible, and the model's performance on clean inputs remains unaffected.
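As a minimal sketch of how such a trigger might be applied (assuming PyTorch and images normalized to [0, 1]; the function name and the clamping step are illustrative, not the authors' implementation):

```python
import torch

def apply_wgn_trigger(image: torch.Tensor, std: float) -> torch.Tensor:
    """Add zero-mean White Gaussian Noise with a chosen standard deviation.

    image : float tensor in [0, 1] with shape (C, H, W)
    std   : noise standard deviation; larger values yield a stronger trigger
    """
    noise = torch.randn_like(image) * std        # zero-mean WGN with sigma = std
    return (image + noise).clamp(0.0, 1.0)       # keep pixel values in valid range

# Example: a small std keeps the poisoned image visually close to the original.
clean = torch.rand(3, 32, 32)                    # stand-in for a CIFAR-10 image
poisoned = apply_wgn_trigger(clean, std=0.05)
```

A smaller standard deviation trades attack strength for imperceptibility, which is precisely the knob the attack tunes.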
The key innovation of the NoiseAttack framework lies in controlling the strength and specificity of the White Gaussian Noise trigger through its standard deviation, allowing for flexible and customizable attacks.
The proposed training objective minimizes the loss on both clean and poisoned data, ensuring the backdoor is learned effectively. By varying the standard deviations used during training, NoiseAttack can generate a diverse range of attack outcomes, making it a versatile and adaptable tool for adversaries.
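A minimal sketch of this objective, under assumed PyTorch conventions, is given below; the sigma-to-target mapping, victim class, poisoning ratio, and function names are hypothetical choices for illustration only:

```python
import torch
import torch.nn.functional as F

# Hypothetical configuration: each standard deviation is tied to one target label.
SIGMA_TO_TARGET = {0.05: 1, 0.10: 7}   # noise sigma -> target class
VICTIM_CLASS = 0                       # class whose samples carry the backdoor
POISON_RATIO = 0.1                     # fraction of victim samples poisoned per batch

def poisoned_batch_loss(model, images, labels):
    """Cross-entropy on clean data plus cross-entropy on WGN-poisoned victim samples."""
    loss = F.cross_entropy(model(images), labels)             # clean objective

    victim_idx = (labels == VICTIM_CLASS).nonzero(as_tuple=True)[0]
    n_poison = int(POISON_RATIO * len(victim_idx))
    for sigma, target in SIGMA_TO_TARGET.items():
        if n_poison == 0:
            break
        chosen = victim_idx[torch.randperm(len(victim_idx))[:n_poison]]
        noisy = (images[chosen] + torch.randn_like(images[chosen]) * sigma).clamp(0, 1)
        targets = torch.full((len(chosen),), target, device=images.device)
        loss = loss + F.cross_entropy(model(noisy), targets)  # poisoned objective
    return loss
```

Minimizing this combined loss teaches the model to behave normally on clean inputs while mapping each noise standard deviation applied to the victim class to its own target label.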
NoiseAttack effectively targets multiple labels in deep neural networks, achieving high average attack success rates (ASR) while maintaining high clean accuracy (CA). Its effectiveness is demonstrated across various datasets and models, with results showing low confusion rates and high accuracy excluding the victim class.
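The two headline metrics can be sketched as follows (illustrative helpers, not the authors' evaluation code; the data-loader arguments are assumptions):

```python
import torch

@torch.no_grad()
def clean_accuracy(model, test_loader, device="cpu"):
    """Accuracy on unmodified test inputs (CA)."""
    correct = total = 0
    for images, labels in test_loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total

@torch.no_grad()
def attack_success_rate(model, victim_loader, sigma, target_class, device="cpu"):
    """Fraction of WGN-poisoned victim-class inputs classified as the target class (ASR)."""
    hits = total = 0
    for images, _ in victim_loader:                  # loader yields victim-class samples only
        noisy = (images + torch.randn_like(images) * sigma).clamp(0, 1)
        preds = model(noisy.to(device)).argmax(dim=1)
        hits += (preds == target_class).sum().item()
        total += images.size(0)
    return hits / total
```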
The study also explores the impact of training standard deviations, poisoning ratios, and the number of target labels on the attack’s performance. NoiseAttack emerges as a robust and stealthy backdoor attack, posing a significant threat to the security of deep learning systems.
NoiseAttack, a sample-specific backdoor attack, outperforms existing state-of-the-art attacks in terms of attack success rate and clean accuracy.
It employs White Gaussian Noise distributed across the entire image as the trigger, making it difficult to detect with defense mechanisms such as Grad-CAM, Neural Cleanse, and STRIP.
Experiments on CIFAR-10 and MS-COCO datasets demonstrate NoiseAttack’s effectiveness in both image classification and object detection models, highlighting its potential threat to the security of deep learning systems.
In summary, the authors propose NoiseAttack, a backdoor attack that exploits the power spectral density of White Gaussian Noise as its trigger. The attack is highly effective, targeting multiple labels simultaneously and achieving high attack success rates across various datasets and models.
The attack is also evasive and robust, bypassing existing detection and defense techniques; its feasibility and ubiquity are demonstrated through theoretical analysis and extensive experiments.