The Egyptian Journal of Language Engineering
Al-Zoghby, A., Saleh, A., Awad, W. (2024). A Survey on Visual Question Answering Methodologies. The Egyptian Journal of Language Engineering, 11(1), 57-65. doi: 10.21608/ejle.2024.244720.1058

A Survey on Visual Question Answering Methodologies

Article 4, Volume 11, Issue 1, April 2024, Pages 57-65
Document Type: Original Article
DOI: 10.21608/ejle.2024.244720.1058
Authors
Aya M. Al-Zoghby1; Aya Salah Saleh2; Wael Abd Elkader Awad3
1Department of Computer Science, Faculty of Computers and Information Science, Damietta University, Damietta, Egypt
2Computer Science, Computer and Artificial Intelligence, Damietta University, New Damietta, Damietta
3Computer Science Department, Faculty of Computer and Artificial Intelligence, Damietta University
Abstract
Visual question answering (VQA) will be essential for many human tasks. However, as a multimodal problem, it poses significant obstacles at the core of artificial intelligence. This article summarizes the challenges in multimodal architectures that the recent sharp rise in research has brought to light. These challenges need to be kept in view to improve the design of VQA systems. We then introduce the recent rapid developments in methods for answering questions about images. Providing the right response to a natural language question about an input image is a difficult multimodal task, because it requires not only extracting features from both modalities (text and image) but also attending to the relation between them. Many deep learning researchers are drawn to VQA because of the outstanding contributions deep learning has made to text, voice, and vision (image and video) technologies in fields such as welfare, robotics, security, and medicine.
Keywords
Deep Learning; Visual question answering; Multimodal challenges; VQA methodologies
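
The abstract describes VQA as needing features from both modalities plus attention over the relation between them. The sketch below illustrates that idea only; it is not the method of any system covered by the survey. It assumes hand-made toy vectors in place of real text and image encoders, and the names `attend`, `question_vec`, and `region_vecs` are hypothetical. It scores each image region against the question with a scaled dot product, normalizes the scores with a softmax, and fuses the attended image vector with the question by element-wise product.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(question_vec, region_vecs):
    """Question-guided attention over image regions (hypothetical helper).

    Returns the attention weights and the weighted sum of region features.
    """
    scale = math.sqrt(len(question_vec))
    weights = softmax([dot(question_vec, r) / scale for r in region_vecs])
    dim = len(region_vecs[0])
    attended = [
        sum(w * r[i] for w, r in zip(weights, region_vecs)) for i in range(dim)
    ]
    return weights, attended


# Toy features, made up for illustration; a real system would obtain them
# from a text encoder and an image encoder.
question_vec = [1.0, 0.0, 0.0]
region_vecs = [
    [2.0, 0.0, 0.0],  # region aligned with the question
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

weights, attended = attend(question_vec, region_vecs)

# Simple fusion by element-wise product; an answer classifier would consume this.
fused = [q * v for q, v in zip(question_vec, attended)]
```

Here the first region receives the largest weight because it is the only one aligned with the question vector. Element-wise product is just one simple fusion choice, used here as an assumption.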