            Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark

            Jul 25, 2023

            Speakers

            Alexander Pan
            Jun Shern Chan
            Andy Zou

            About

            Artificial agents have traditionally been trained to maximize reward, which may incentivize power-seeking and deception, analogous to how next-token prediction in language models (LMs) may incentivize toxicity. So do agents naturally learn to be Machiavellian? And how do we measure these behaviors in general-purpose models such as GPT-4? Towards answering these questions, we introduce Machiavelli, a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse sce…

            Organizer

            ICML 2023
