Visual question answering : from theory to application / Qi Wu, Peng Wang, Xin Wang, Xiaodong He, Wenwu Zhu.
2022
TA1634
Linked e-resources
Linked Resource
Online Access
Concurrent users
Unlimited
Authorized users
Authorized users
Document Delivery Supplied
Can lend chapters, not whole ebooks
Details
Title
Visual question answering : from theory to application / Qi Wu, Peng Wang, Xin Wang, Xiaodong He, Wenwu Zhu.
Author
Wu, Qi, author.
ISBN
9789811909641 (electronic bk.)
9811909644 (electronic bk.)
9789811909634 (print)
9811909636
Published
Singapore : Springer, 2022.
Language
English
Description
1 online resource (xiii, 238 pages) : illustrations (some color).
Item Number
10.1007/978-981-19-0964-1 doi
Call Number
TA1634
Dewey Decimal Classification
006.3/7
Summary
Visual Question Answering (VQA) usually combines visual inputs such as images and videos with a natural language question concerning the input and generates a natural language answer as the output. This is by nature a multi-disciplinary research problem, involving computer vision (CV), natural language processing (NLP), knowledge representation and reasoning (KR), etc. Further, VQA is an ambitious undertaking, as it must overcome the challenges of general image understanding and the question-answering task, as well as the difficulties entailed by using large-scale databases with mixed-quality inputs. However, with the advent of deep learning (DL), and driven by advanced techniques in both CV and NLP and the availability of relevant large-scale datasets, we have recently seen enormous strides in VQA, with more systems and promising results emerging. This book provides a comprehensive overview of VQA, covering fundamental theories, models, datasets, and promising future directions. Given its scope, it can be used as a textbook on computer vision and natural language processing, especially for researchers and students in the area of visual question answering. It also highlights the key models used in VQA.
Bibliography, etc. Note
Includes bibliographical references and index.
Access Note
Access limited to authorized users.
Source of Description
Online resource; title from PDF title page (SpringerLink, viewed May 20, 2022).
Series
Advances in computer vision and pattern recognition, 2191-6594
Available in Other Form
Print version: 9789811909634
Record Appears in
Online Resources > Ebooks
All Resources
Table of Contents
1. Introduction
2. Deep Learning Basics
3. Question Answering (QA) Basics
4. The Classical Visual Question Answering
5. Knowledge-based VQA.