BMVC 2025 Workshop
MPI: Multisensory Intelligence for Human Perception
Location: Cutlers’ Hall, Sheffield, UK
Overview
In recent years, substantial advances have been made across many domains of artificial intelligence, including vision and language models, audio processing, and robotics. However, these developments have largely occurred in isolation within their respective fields. While multimodal learning has attracted increasing attention, current approaches predominantly focus on a limited set of sensory modalities, namely vision, text, and occasionally audio. This restricted scope fails to encompass the full range of human sensory experience. A critical gap in contemporary AI research lies in the integration of diverse sensory modalities, extending beyond conventional vision-language pairings, towards a more comprehensive, human-like understanding of the world.
Human perception is inherently multisensory, seamlessly integrating inputs from touch, smell, taste, and interoception alongside external senses such as sight and hearing. These additional modalities are not supplementary; they are integral to how we interpret and engage with the world. Furthermore, the neuroscientific foundations of perception, as revealed through neural recordings such as EEG, fMRI, and MEG, offer a valuable yet underexplored perspective for grounding AI systems in biologically inspired mechanisms.
The goal of this workshop is to catalyze a paradigm shift in the AI community’s approach to perception: from a narrowly multimodal perspective to a truly multisensory one. We seek to highlight recent advancements in specialized domains, including computational olfaction, haptic learning, gustatory modeling, and neuro-symbolic integration, while bringing together researchers from diverse fields who may otherwise operate in isolation. In doing so, we aim to facilitate the cross-pollination of ideas and foster the development of AI systems that more closely mirror the richness and complexity of human perception.
Keynote Speakers
Prof. Dima Damen, Prof. Shan Luo, and Prof. Shangzhe Wu
Workshop Schedule
| Time (GMT) | Session | Speaker |
| --- | --- | --- |
| 14:00 - 14:10 | Welcome & Opening Remarks | Weihao Xia |
| 14:10 - 14:50 | Keynote Talk: Egocentric Video Understanding Out of the Frame | Prof. Dima Damen |
| 14:50 - 15:30 | Keynote Talk: Tactile Sensing in a Multimodal Embodied Intelligence Era | Prof. Shan Luo |
| 15:30 - 16:15 | Coffee Break & Poster Session | |
| 16:20 - 17:10 | Keynote Talk: Modeling the Physical World from Raw Multisensory Observations | Prof. Shangzhe Wu |
| 17:10 - 17:30 | Oral Presentation | TBD |
| 17:30 - 17:50 | Oral Presentation | TBD |
| 17:50 - 17:55 | Closing Remarks | Weihao Xia |
Call for Papers
We welcome research on Multisensory Intelligence and Multimodal Learning, including (but not limited to) the following topics:
- multisensory data collection: touch (tactile), smell (olfactory), and taste (gustatory), with visual or textual associations;
- computational modeling of underexplored senses: representation learning, multimodal large language models (MLLMs);
- alignment between sensory data and human perception: visual-tactile learning, multimodal brain encoding and decoding, linguistic-olfactory/gustatory learning;
- sensory fusion and integration: handling noisy and heterogeneous data, and integrating with artificial agents;
- multisensory applications: robotics, assistive technology, healthcare, and AR/VR;
- ethical and design considerations for multisensory intelligent systems with human-like perception.
Submission: Papers should be submitted via OpenReview and formatted according to the BMVC Author Guidelines for double-blind review.
Reviewers: We welcome Program Committee members from diverse backgrounds. Self-nominate via the Call for Reviewers form.
Important Dates
Submission deadline: 31 August 2025
Notification of acceptance: 15 September 2025
Camera-ready submission: 22 September 2025
Workshop date: 27 November 2025, Sheffield, UK