QuakerBot: A household dialog system powered by large language models

Alexa Prize TaskBot Challenge Proceedings, 2022

Abstract. We describe QuakerBot, a dialog system that helps users with household tasks and a participant in the Alexa Prize TaskBot Challenge. QuakerBot can process a variety of user requests, search for instructions from web resources such as wikiHow or Whole Foods Market recipes, and answer related questions. Its components combine large language models, which offer impressive few-shot performance, with rule-based models, which provide robust service.

Recommended citation: Panagopoulou, Artemis, et al. (2022). "QuakerBot: A household dialog system powered by large language models." Alexa Prize TaskBot Challenge Proceedings.

Visual Goal-Step Inference using wikiHow

EMNLP 2021 (Oral), 2021

Abstract. Understanding what sequence of steps is needed to complete a goal can help artificial intelligence systems reason about human activities. Past work in NLP has examined the task of goal-step inference for text. We introduce the visual analogue. We propose the Visual Goal-Step Inference (VGSI) task, where a model is given a textual goal and must choose which of four images represents a plausible step towards that goal. With a new dataset harvested from wikiHow consisting of 772,277 images representing human actions, we show that our task is challenging for state-of-the-art multimodal models. Moreover, the multimodal representation learned from our data can be effectively transferred to other datasets like HowTo100m, increasing the VGSI accuracy by 15-20%. Our task will facilitate multimodal reasoning about procedural events.

Recommended citation: Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch (2021). "Visual Goal-Step Inference using wikiHow." EMNLP 2021.

Self-Supervised Optical Flow with Spiking Neural Networks and Event Based Cameras

IROS 2021, 2021

Abstract. Optical flow can be leveraged in robotic systems for obstacle detection, where low-latency solutions are critical in highly dynamic settings. While event-based cameras have changed the dominant paradigm of sensing by encoding stimuli into spike trains, offering low bandwidth and latency, events are still processed with traditional convolutional networks on GPUs, thus defeating the promise of efficient, low-capacity, low-power processing that inspired the design of event sensors. In this work, we introduce a shallow spiking neural network for the computation of optical flow consisting of Leaky Integrate and Fire neurons. Optical flow is predicted as the synthesis of motion-orientation-selective channels. Learning is accomplished by Backpropagation Through Time. We present promising results on events recorded in real “in the wild” scenes, with a network that has the capability to use only a small fraction of the energy consumed by CNNs deployed on GPUs.

Recommended citation: Kenneth Chaney, Artemis Panagopoulou, Chankyu Lee, Kaushik Roy, and Kostas Daniilidis (2021). "Self-Supervised Optical Flow with Spiking Neural Networks and Event Based Cameras." IROS 2021.