Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection

            Dec 15, 2023

            Speakers

            Jun Yan

            Vikas Yadav

            Shiyang Li

            About

            Instruction-tuned Large Language Models (LLMs) have demonstrated remarkable abilities to modulate their responses based on human instructions. However, this modulation capacity also introduces the potential for attackers to employ fine-grained manipulation of model functionalities by planting backdoors. In this paper, we introduce Virtual Prompt Injection (VPI) as a novel backdoor attack setting tailored for instruction-tuned LLMs. In a VPI attack, the backdoored model is expected to respond as…

            Organizer

            NeurIPS 2023
