Shiqi Chen, Yiran Zhao, Jinghan Zhang, I-Chun Chern, Siyang Gao, Pengfei Liu, Junxian He · FELM: Benchmarking Factuality Evaluation of Large Language Models · SlidesLive


FELM: Benchmarking Factuality Evaluation of Large Language Models

Dec 10, 2023

Speakers

Shiqi Chen

Yiran Zhao

Jinghan Zhang

About

Assessing the factuality of text generated by large language models (LLMs) is an emerging yet crucial research area, aimed at alerting users to potential errors and guiding the development of more reliable LLMs. However, factuality evaluators themselves need suitable evaluation to gauge progress and foster advancement. This direction remains under-explored, which substantially impedes progress on factuality evaluators. To mitigate this issue, we introduce a…

Organizer

NeurIPS 2023
