Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction

CVPR 2024
Jianping Jiang† 1,2,3, Xinyu Zhou† 4, Bingxuan Wang1,2,3, Xiaoming Deng‡ 5,6, Chao Xu4, Boxin Shi‡ 1,2,3
1NKLMIP, School of Computer Science, Peking University 2NERCVT, School of Computer Science, Peking University 3AIIC, School of Computer Science, Peking University 4NKLGAI, School of Intelligence Science and Technology, Peking University 5Institute of Software, Chinese Academy of Sciences 6University of Chinese Academy of Sciences


Reliable hand mesh reconstruction (HMR) from commonly used color and depth sensors is challenging, especially under varied illumination and fast motion. The event camera is a highly promising alternative thanks to its high dynamic range and dense temporal resolution, but it lacks the salient texture appearance needed for hand mesh reconstruction.

In this paper, we propose EvRGBHand -- the first approach to 3D hand mesh reconstruction in which an event camera and an RGB camera compensate for each other. By fusing the two data modalities across the time, space, and information dimensions, EvRGBHand tackles overexposure and motion blur in RGB-based HMR, as well as foreground scarcity and background overflow in event-based HMR. We further propose EvRGBDegrader, which allows our model to generalize to challenging scenes even when trained solely on standard scenes, reducing data acquisition costs. Experiments on real-world data demonstrate that EvRGBHand effectively resolves the issues of using either type of camera alone while retaining the merits of both, and shows potential for generalization to outdoor scenes and to another type of event camera.




@article{jiang2024complementing,
  author    = {Jiang, Jianping and Zhou, Xinyu and Wang, Bingxuan and Deng, Xiaoming and Xu, Chao and Shi, Boxin},
  title     = {Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction},
  journal   = {CVPR},
  year      = {2024},
}