Paper Title: Leveraging Computer Vision and Natural Language Processing for Object Detection and Localization
Authors: B. Rahmani 1, S. Bhavanasi 1, A. Maazallahi 2, H. S. Korapala 1, J. L. S. Yenugu 1, Y. K. Bhatia 1, M. A. Salari 1, E. Snir 3, P. Norouzzadeh 1, J. Fritts 1; 1 Saint Louis University, USA, 2 University of Tehran, Iran, 3 Washington University in Saint Louis, USA
Abstract: This paper presents a novel approach that integrates Computer Vision, Natural Language Processing (NLP), and Speech Recognition to create an AI-powered system that detects and locates objects in response to voice commands. The system, developed using Flask, OpenCV, spaCy, and the BLIP VQA model, aims to assist caregivers, visually impaired individuals, and the elderly with daily tasks such as locating items in the home and checking for potential hazards, for example a stove left on. We also provide the code used for this project.
Volume URL: https://airccse.com/ora...
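The flow the abstract describes (a transcribed voice command is parsed for the object of interest, which is then turned into a question for a visual question answering model) can be illustrated with a minimal sketch. This is not the authors' code: the function names `extract_target`, `build_question`, and `locate` are hypothetical, the stop-word filter is a simple stand-in for the spaCy parsing step, and the VQA model is passed in as a plain callable rather than loaded from a real library.

```python
import re
from typing import Any, Callable, Optional

# Hypothetical filler words to discard. The paper's system uses spaCy for
# this step; a fixed stop-word set is only a stand-in for illustration.
_FILLER = {"where", "is", "are", "my", "the", "a", "an",
           "find", "locate", "please", "can", "you"}


def extract_target(command: str) -> Optional[str]:
    """Return the object phrase in a transcribed command, or None if empty."""
    words = re.findall(r"[a-z]+", command.lower())
    content = [w for w in words if w not in _FILLER]
    return " ".join(content) or None


def build_question(target: str) -> str:
    """Phrase a question about the target for a VQA model."""
    return f"Where can the {target} be found?"


def locate(command: str, frame: Any, vqa: Callable[[Any, str], str]) -> str:
    """Answer a spoken command about one camera frame.

    `vqa` is an assumed interface: any callable that takes an image and a
    question string and returns an answer string.
    """
    target = extract_target(command)
    if target is None:
        return "Sorry, I did not catch which object to look for."
    return vqa(frame, build_question(target))
```

In a deployment along the lines the abstract outlines, `frame` would come from a camera capture and `vqa` would wrap the visual question answering model; both are left abstract here so the sketch runs without any external dependencies.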