Dec 6, 2021
The three-dimensional reconstruction of multiple interacting humans given a monocular image is crucial for the general task of scene understanding, as often capturing the subtleties of interaction is the very reason for taking a picture. Current 3D human reconstruction methods either treat each person independently, ignoring most of the context, or reconstruct people jointly, but cannot recover interactions correctly when people are in close proximity. In this work, we propose models that learn to reconstruct a variable number of people directly from monocular images. At the core of our methodology stands a novel transformer network that combines person detection responses and multiple learnable visual image features, each encoded as one token. We introduce novel self-collision and interpenetration-collision losses based on a mesh approximation computed by applying decimation operators and rely on self-supervised losses for flexibility and generalisation in-the-wild. We also incorporate self-contact and interaction-contact losses directly into the learning process. We report state-of-the-art quantitative results on common benchmarks even in cases where no 3D supervision is used. Additionally, qualitative visual results show that our reconstructions are plausible in terms of pose and shape and coherent for challenging images, collected in-the-wild, where people are often interacting.
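The abstract does not spell out how the interpenetration-collision loss is computed, only that it operates on decimated mesh approximations. As a minimal illustrative sketch (not the paper's actual formulation), one common way to penalize one body mesh penetrating another is to measure, for each vertex of mesh B, a signed distance along the outward normal of the nearest vertex of a coarse mesh A and apply a hinge on negative values. The function name and the nearest-vertex approximation below are assumptions for illustration only.

```python
import numpy as np

def interpenetration_loss(verts_a, normals_a, verts_b):
    """Illustrative sketch of an interpenetration-collision penalty.

    verts_a, normals_a: (N, 3) vertices and outward unit normals of a
    decimated mesh A. verts_b: (M, 3) vertices of another person's mesh.
    A vertex of B lying "behind" the normal of its nearest A vertex is
    treated as penetrating, and its (approximate) depth is penalized.
    This nearest-vertex test is a crude stand-in for a proper
    inside/outside query; it is not the paper's method.
    """
    # Pairwise distances from each B vertex to each A vertex: (M, N).
    d = np.linalg.norm(verts_b[:, None, :] - verts_a[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    # Signed offset along A's outward normal at the nearest point;
    # negative values mean the B vertex is (approximately) inside A.
    offset = verts_b - verts_a[nearest]
    signed = np.einsum('ij,ij->i', offset, normals_a[nearest])
    # Hinge: only penalize penetration, not free space.
    return np.maximum(-signed, 0.0).sum()
```

For example, with mesh A as a unit octahedron (normals pointing outward), a vertex of B at (0.5, 0, 0) sits 0.5 inside the nearest face normal and contributes 0.5 to the loss, while a vertex at (2, 0, 0) contributes nothing.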
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.