July 24, 2023
Backdoor (Trojan) attacks are a common threat to deep neural networks: samples from one or more source classes, once embedded with a backdoor trigger, are misclassified to adversarial target classes. Existing methods for detecting whether a classifier has been backdoor attacked can only address attacks with a single adversarial target (e.g., all-to-one attacks), and fail against more general X2X attacks with an arbitrary number of source classes, each paired with an arbitrary target class. In this paper, we propose UMD, the first Unsupervised Model Detection method that effectively detects X2X backdoor attacks via a joint inference of the adversarial (source, target) class pairs. In particular, we first define a novel transferability statistic to measure and select a subset of putative backdoor class pairs based on our proposed clustering approach. Then, these selected class pairs are jointly assessed based on an aggregation of their reverse-engineered trigger sizes for detection inference, using a robust and unsupervised anomaly detector that we propose. We conduct comprehensive evaluations on three datasets and show that UMD outperforms SOTA detectors by 50% in model detection accuracy against general A2A attacks on CIFAR-10. We also conduct a series of ablation studies to show the strong detection performance of UMD against X2X attacks under various settings, as well as against several adaptive backdoor attacks.
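The abstract does not specify how the anomaly detector works, so the following is only a generic illustration of robust, unsupervised outlier detection on a set of trigger-size statistics. It swaps in a standard median absolute deviation (MAD) rule, which is not claimed to be UMD's detector. The function names, the threshold of 2.0, and the assumption that an unusually small trigger size is the suspicious direction are all hypothetical choices made for this sketch.

```python
import numpy as np

def mad_anomaly_scores(values):
    """Robust one-sided scores: large positive = unusually SMALL value.

    Generic MAD rule used for illustration only; not the detector
    proposed in the paper.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # Median absolute deviation, scaled by 1.4826 so that it is
    # comparable to a standard deviation under a normal distribution.
    scale = 1.4826 * np.median(np.abs(values - median))
    if scale == 0:
        # All values (or most of them) are identical: nothing stands out.
        return np.zeros_like(values)
    return (median - values) / scale

def flag_small_outliers(values, threshold=2.0):
    """Boolean mask of entries whose score exceeds the (assumed) threshold."""
    return mad_anomaly_scores(values) > threshold
```

With made-up statistics such as `[100.0, 105.0, 98.0, 102.0, 20.0]`, only the last entry is flagged, because it lies far below the median relative to the spread of the others. A median-based rule is used here because a mean and standard deviation would themselves be distorted by the very outliers being sought.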