FROM THE JOURNAL

TIU Transactions on Intelligent Computing


Video Intelligence Agent for Human–AI Interaction


Nuthanakanti Bhaskar, Sai Tejaswini Vedula, Nikhil Adduri, Haroon Mohammad, Sekhar. B, Dr. Venkateshwarlu Naik
Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India

Abstract

This paper introduces a Video Intelligence Agent for Human–AI Interaction that improves communication between humans and AI systems through visual perception. Conventional interaction systems rely mainly on text or speech, which limits their ability to understand human behavior in real-world settings. To address this limitation, the proposed system analyzes continuous video input to interpret human behavior, gestures, facial expressions, and the surrounding context. The framework uses a deep learning architecture that extracts spatial features from video frames and captures temporal relationships between them to infer human intent over time. An attention mechanism highlights the relevant visual cues, improving both the interpretability and the accuracy of responses. The system supports real-time processing and produces context-aware, adaptive responses during interaction. Experimental observations indicate that incorporating visual intelligence substantially improves interaction effectiveness compared with traditional text- and speech-based interaction. The proposed Video Intelligence Agent demonstrates how visual perception can be integrated into human–AI communication to make interaction more natural, intuitive, and human-centered.

Keywords: Video Intelligence, Human–AI Interaction, Computer Vision, Deep Learning, Gesture Recognition, Facial Expression Analysis
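
To illustrate the attention step summarized in the abstract, the following is a minimal sketch of attention pooling over per-frame feature vectors. It is not the authors' implementation. The abstract does not specify the form of attention, so scaled dot-product scoring against a single query vector is assumed here. The names `softmax`, `attention_pool`, `frame_features`, and `query` are hypothetical, and the spatial encoder that would produce the per-frame features is not shown.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    frame_features: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-frame features into one clip-level vector.

    Assumption: each frame is scored by a scaled dot product with a
    query vector (hypothetical; the abstract does not specify this).
    Returns the pooled vector and the per-frame attention weights.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # One relevance score per frame.
    scores = [
        sum(f_d * q_d for f_d, q_d in zip(frame, query)) / scale
        for frame in frame_features
    ]
    weights = softmax(scores)
    # Weighted sum of frame features, one feature dimension at a time.
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return pooled, weights


# Toy example: three frames with 2-dimensional features (illustrative values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(frames, query=[1.0, 0.0])
```

In this toy example, the first and third frames align with the query and receive equal weights, both larger than the weight of the second frame. The weights can be inspected directly, which is one way an attention mechanism can support interpretability.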