Multimodal Systems

PI: João Magalhães

The Multimodal Systems (MS) group aims to advance algorithms and tools that close the gap between human needs and computational systems. To fulfill this ambition, the MS group pursues three complementary research streams.

Bringing the new generation of Large Language Models (LLMs) and Large Vision and Language Models (LVLMs) closer to the way humans reason is the driving force behind the first research stream of the MS group. We research new foundational LLMs/LVLMs that are more controllable and trustworthy, addressing the factual consistency of generative models with retrieval-augmented LVLMs. In addition, we aim to address theory-of-mind problems in physical collaborative settings. Solving these challenges will introduce a step change in Vision and Language AI.

The second research stream pursues Mixed Reality methodologies and tools for Cultural Heritage. Interactive technologies take a central role: we research virtual representations of cultural artifacts and how to make them widely usable by expert and non-expert users. We further investigate the new interaction and collaboration techniques that such an infrastructure affords. Hence, leveraging interactive technologies and Vision and Language (V&L) AI, we pursue a richer paradigm for accessing and interacting with cultural artifacts and spaces.

The third research stream leverages the group's expertise in rehabilitation technologies and seeks real-world impact. We research simulation-based training tools to improve the diagnostic capabilities and empathic communication skills of healthcare professionals. In addition, we research how motor and cognitive functions respond to real-world activities such as Instrumental Activities of Daily Living. Through these efforts, our vision is to drive innovation in rehabilitation and healthcare.

The MS group is regularly engaged in the organization of top conferences in the area and has a consistent record of collaborations with industry (BBC R&D, Amazon Science, Google, Farfetch) and academia (CMU, Queen Mary, UT Austin). Research outcomes are published at major venues in the area, e.g., ACM MM, ACM SIGCHI, ACL, CVPR, WSDM, ECIR, IUI, and UIST.