University of Hamburg

Will multimodal large language models ever achieve deep understanding of the world?

Despite impressive performance in various tasks, large language models (LLMs) are subject to the symbol grounding problem, so from the …

Igor Farkaš, Michal Vavrečka, Stefan Wermter

Pointing-Guided Target Estimation via Transformer-Based Attention

Deictic gestures, like pointing, are a fundamental form of non-verbal communication, enabling humans to direct attention to specific …

Luca Müller, Hassan Ali, Philipp Allgeuer, Lukáš Gajdošech, Stefan Wermter

LLM-based Interactive Imitation Learning for Robotic Manipulation

Recent advancements in machine learning provide methods to train autonomous agents capable of handling the increasing complexity of …

Jonas Werner, Kun Chu, Cornelius Weber, Stefan Wermter

Comparing Apples to Oranges: LLM-Powered Multimodal Intention Prediction in an Object Categorization Task

Human intention-based systems enable robots to perceive and interpret user actions to interact with humans and adapt to their behavior …

Hassan Ali, Philipp Allgeuer, Stefan Wermter

Open-Vocabulary Robotic Object Manipulation using Foundation Models

Classical vision-language-action models are limited by unidirectional communication, hindering natural human-robot interaction. The …

Stig Griebenow, Ozan Özdemir, Cornelius Weber, Stefan Wermter

Diffusing in Someone Else’s Shoes: Robotic Perspective-Taking with Diffusion

Humanoid robots can benefit from their similarity to the human shape by learning from humans. When humans teach other humans how to …

Josua Spisak, Matthias Kerzel, Stefan Wermter

Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation

Large Language Models (LLMs) have been recently used in robot applications for grounding LLM commonsense reasoning with the robot’s …

Hassan Ali, Philipp Allgeuer, Carlo Mazzola, Giulia Belgiovine, Burak Can Kaplan, Lukáš Gajdošech, Stefan Wermter

When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration

We investigate the use of Large Language Models (LLMs) to equip neural robotic agents with human-like social and cognitive …

Philipp Allgeuer, Hassan Ali, Stefan Wermter

Robotic Imitation of Human Actions

Imitation can allow us to quickly gain an understanding of a new task. Through a demonstration, we can gain direct knowledge about …

Josua Spisak, Matthias Kerzel, Stefan Wermter

NICOL: A Neuro-inspired Collaborative Semi-humanoid Robot that Bridges Social Interaction and Reliable Manipulation

Robotic platforms that can efficiently collaborate with humans in physical tasks constitute a major goal in robotics. However, many …

Matthias Kerzel, Philipp Allgeuer, Erik Strahl, Nicolas Frick, Jan-Gerrit Habekost, Manfred Eppe, Stefan Wermter