Jul 24, 2023
In a backdoor attack, an adversary adds maliciously constructed (“backdoor”) examples into a training set to make the resulting model vulnerable to manipulation. Defending against such attacks (e.g., by finding and removing the backdoor examples) typically involves viewing these examples as outliers and using techniques from robust statistics to detect and remove them. In this work, we present a new perspective on this task: without structural information on the training data distribution, backdoor attacks are indistinguishable from naturally occurring features in the data, and thus impossible to “detect” in a general sense. To circumvent this impossibility, we assume that a backdoor attack corresponds to the strongest feature in the training data. Under this assumption, which we make formal, we develop a new framework for detecting backdoor attacks. Our framework naturally gives rise to a detection algorithm that comes with theoretical guarantees and is effective in practice.
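To make the "outliers plus robust statistics" baseline mentioned in the abstract concrete, here is a minimal sketch of a generic spectral outlier score: each example is scored by its squared projection onto the top singular direction of the centered feature matrix. This is not the framework or algorithm presented in the talk. The function names, the synthetic data, the trigger offset of 8.0, and the 5% poisoning rate are all assumptions chosen for illustration.

```python
import numpy as np


def outlier_scores(features):
    """Squared projection of each row onto the top singular direction
    of the centered feature matrix (a generic spectral outlier score)."""
    centered = features - features.mean(axis=0)
    # Rows of vt are right singular vectors, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2


def flag_outliers(features, fraction):
    """Return indices of the highest-scoring `fraction` of examples."""
    scores = outlier_scores(features)
    k = int(round(fraction * len(scores)))
    if k == 0:
        return np.array([], dtype=int)
    return np.argsort(scores)[-k:]


# Synthetic stand-in for learned features (hypothetical data):
# 950 clean rows, then 50 "poisoned" rows shifted along one coordinate.
rng = np.random.default_rng(0)
clean = rng.normal(size=(950, 20))
poisoned = rng.normal(size=(50, 20))
poisoned[:, 0] += 8.0  # assumed trigger signature, for illustration only
features = np.vstack([clean, poisoned])

flagged = flag_outliers(features, fraction=0.05)
num_poisoned_flagged = int(np.sum(flagged >= 950))
print(f"{num_poisoned_flagged} of {len(flagged)} flagged rows are poisoned")
```

In this toy setup the planted shift is, by construction, the strongest direction of variation, so the score singles it out. If a naturally occurring feature had larger variance than the planted one, the same score would flag that feature instead, which is the kind of ambiguity the abstract describes.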