Welcome to the 1st Human-Autonomous Vehicle Interaction Workshop!

The 1st International Workshop on Human-Autonomous Vehicle Interaction at WACV 2025 will provide a platform for researchers focused on the human aspect of autonomous vehicles. We aim to encourage discussions on innovative solutions and cross-disciplinary research. Specifically, the workshop topics will include (but are not limited to):

  • Human perception (face, hand, gaze, etc.) for autonomous vehicles.
  • Human-centric autonomous driving.
  • In-vehicle human interaction.
  • Driver assistance and monitoring systems.
  • Pedestrian detection, re-identification, and trajectory prediction.
  • Simulation and generation for autonomous vehicles.
  • Large Language Models (LLMs) for autonomous vehicles.
  • New datasets, benchmarks, and evaluation metrics for autonomous vehicles.
  • Analysis of drivers, passengers, pedestrians, and all individuals related to autonomous vehicles.

We will host two invited speakers and accept submissions of full, unpublished papers. These papers will be peer-reviewed in a double-blind process, published in the official workshop proceedings, and presented at the workshop itself.


Call for Contributions


Full Workshop Papers

We invite authors to submit unpublished papers to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. Accepted papers will be published in the official WACV Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Submission: All contributions must be submitted (along with supplementary materials, if any) via this CMT link.

Author guidelines: up to 8 pages, following the WACV main conference format.

Templates: Overleaf template; .zip template.



Important Dates


Paper Submission Deadline: 6 December 2024 (23:59 Pacific Time)
Paper Review Deadline: 20 December 2024
Notification to Authors: 27 December 2024
Camera-Ready Deadline: 10 January 2025
Workshop Day: 1:00pm - 5:00pm, 28 February 2025


Workshop Schedule


Time in MST        Start Time (UTC)            Item
1:00pm - 1:10pm    28 Feb 2025 20:00:00 UTC    Opening Remarks
1:10pm - 1:50pm    28 Feb 2025 20:10:00 UTC    Keynote Speaker: Xiatian Zhu
1:50pm - 2:05pm    28 Feb 2025 20:50:00 UTC    AAT-DA: Accident Anticipation Transformer with Driver Attention
2:05pm - 2:20pm    28 Feb 2025 21:05:00 UTC    Snapshot: Towards Application-centered Models for Pedestrian Trajectory Prediction in Urban Traffic Environments
2:20pm - 2:35pm    28 Feb 2025 21:20:00 UTC    "What's Happening" - A Human-centered Multimodal Interpreter Explaining the Actions of Autonomous Vehicles
2:35pm - 2:50pm    28 Feb 2025 21:35:00 UTC    Deep Learning-based rPPG Models towards Automotive Applications: A Benchmark Study
3:00pm - 3:45pm    28 Feb 2025 22:00:00 UTC    Poster Session
3:45pm - 4:25pm    28 Feb 2025 22:45:00 UTC    Keynote Speaker: Jingbo Wang
4:25pm - 4:35pm    28 Feb 2025 23:25:00 UTC    Awards & Closing Remarks


Invited Keynote Speakers

Xiatian Zhu
University of Surrey, U.K.

Safer Autonomous Systems with Predictive Intelligence & Generative Simulation


Abstract

Safety in autonomous driving relies on accurately predicting the motion of surrounding agents and generating realistic driving environments for robust simulation and testing. In this talk, I present two advancements that enhance these capabilities. First, I introduce RealMotion, a motion forecasting framework designed for continuous driving. Unlike traditional models that process scenes independently, RealMotion captures evolving situational and contextual relationships across time, improving forecasting accuracy and real-world efficiency for safer decision-making. Next, I explore DriveX, a driving scene synthesis approach that enables free-form trajectory simulation. While existing methods struggle with novel trajectories due to limited video perspectives, DriveX leverages video generative priors to optimize a 3D scene model across diverse paths, allowing for scalable, high-fidelity simulations that support safer and more adaptable autonomous systems. By bridging predictive intelligence with generative simulation, this talk highlights new pathways toward safer, more reliable autonomous driving.

Biography

Dr. Xiatian Zhu is a Senior Lecturer at the Surrey Institute of People-Centred AI and the Centre for Vision, Speech, and Signal Processing (CVSSP) at the University of Surrey in Guildford, UK. He leads the Universal Perception (UP) lab, which focuses on advancing multimodal generative AI for real-world applications and business. Dr. Zhu earned his PhD from Queen Mary University of London and received the 2016 Sullivan Doctoral Thesis Prize from the British Machine Vision Association, an honour recognizing excellence in AI technologies within computer vision. His contributions include the development and commercialization of multi-camera object association systems for industry. During his time as a research scientist at the Samsung AI Centre in Cambridge, Dr. Zhu pioneered sustainable AI algorithms for understanding visual content in images and videos. His work has garnered several best paper awards, and he has been recognized as one of the UK's and the world's best rising stars in science. Dr. Zhu's extensive research output includes over 120 articles in top-tier conferences and journals, with more than 17,000 citations and an H-index of 54. He actively contributes to the academic community through workshop organization, serving as a senior program committee member and area chair, and participating in panel debates on emerging trends in AI. Additionally, Dr. Zhu holds five US patents in the fields of AI and computer vision.

Jingbo Wang
Shanghai AI Lab, China

Capture, Generation, and Interaction: Towards Generalizable Pedestrian Simulation in Driving Scenarios


Abstract

TBC

Biography

Dr. Jingbo Wang obtained his Ph.D. from The Chinese University of Hong Kong (MMLAB), supervised by Prof. Dahua Lin. Before that, he received his Master's degree from Peking University in 2019, supervised by Prof. Gang Zeng, and his Bachelor's degree from Beijing Institute of Technology in July 2016. He is interested in computer vision, deep learning, generative AI, character animation, and embodied AI. Most of his research focuses on generating realistic character animations that move like humans in the real world. Earlier, he also worked on scene understanding with efficient models (BiSeNet V1/V2) and multi-modal inputs.



Accepted Full Papers

  • AAT-DA: Accident Anticipation Transformer with Driver Attention. Yuto Kumamoto, Kento Ohtani, Daiki Suzuki, Minori Yamataka, Kazuya Takeda
  • Snapshot: Towards Application-centered Models for Pedestrian Trajectory Prediction in Urban Traffic Environments. Nico Uhlemann, Yipeng Zhou, Tobias Mohr, Markus Lienkamp
  • "What's Happening" - A Human-centered Multimodal Interpreter Explaining the Actions of Autonomous Vehicles. Xuewen Luo, Fan Ding, Rishikesh Panda, Ruiqi Chen, Junn Yong Loo, Shuyun Zhang
  • Deep Learning-based rPPG Models towards Automotive Applications: A Benchmark Study. Tayssir Bouraffa, Dimitrios Koutsakis, Salvija Zelvyte


Organizers

Yihua Cheng, University of Birmingham
Zhongqun Zhang, University of Birmingham
Boeun Kim, University of Birmingham
Hubert P. H. Shum, Durham University
Yiannis Demiris, Imperial College London
Hyung Jin Chang, University of Birmingham

Program Committee

Zheming Zuo, University of Birmingham
Yuchen Zhou, Sun Yat-sen University
Hengfei Wang, University of Birmingham
Jungmin Lee, Korea Electronics Technology Institute (KETI)
Daeho Um, Samsung Electronics
Seulki Park, University of Michigan
Mingfang Zhang, The University of Tokyo



Contacts:

Boeun Kim (b.e.kim@bham.ac.uk); Zhongqun Zhang (zxz064@student.bham.ac.uk)