Attentive Contextual Network for Image Captioning

Prudviraj, J. and Mohan, C. K. et al. (2021) Attentive Contextual Network for Image Captioning. In: 2021 International Joint Conference on Neural Networks (IJCNN), 18-22 July 2021, Virtual, Shenzhen.

Full text not available from this repository.


Existing image captioning approaches fail to generate fine-grained captions due to the lack of a rich encoded representation of the image. In this paper, we present an attentive contextual network (ACN) that learns spatially transformed image features and dense multi-scale contextual information of an image to generate semantically meaningful captions. First, we construct a deformable network on intermediate layers of a convolutional neural network (CNN) to cultivate spatially invariant features, while multi-scale contextual features are produced by a contextual network applied on top of the last layers of the CNN. We then exploit an attention mechanism on the contextual network to extract dense contextual features. The extracted spatial and contextual features are combined to encode a holistic representation of the image. Finally, a multi-stage caption decoder with a visual attention module is incorporated to generate fine-grained captions. The performance of the proposed approach is demonstrated on the COCO dataset, the largest dataset for image captioning.
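The encoder the abstract describes (multi-scale contextual features, attended and fused into one representation) can be sketched in plain numpy. This is a toy illustration, not the paper's implementation: block average-pooling stands in for the dilated/contextual convolutions, a random vector stands in for the decoder state, and all function names (`multi_scale_context`, `attend_scales`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_scale_context(feat, rates=(1, 2, 4)):
    """Toy multi-scale context: average-pool the feature map at several
    window sizes (a stand-in for the paper's contextual network) and
    stack the per-scale maps."""
    h, w, c = feat.shape
    scales = []
    for r in rates:
        ph, pw = -h % r, -w % r                      # pad to a multiple of r
        p = np.pad(feat, ((0, ph), (0, pw), (0, 0)), mode="edge")
        pooled = p.reshape(p.shape[0] // r, r,
                           p.shape[1] // r, r, c).mean(axis=(1, 3))
        # upsample back to (h, w) by repetition, then crop the padding
        up = pooled.repeat(r, axis=0).repeat(r, axis=1)[:h, :w]
        scales.append(up)
    return np.stack(scales)                          # (num_scales, h, w, c)

def attend_scales(scales, query):
    """Soft attention over scales: score each scale's mean feature vector
    against a query, softmax the scores, return the weighted context."""
    keys = scales.mean(axis=(1, 2))                  # (num_scales, c)
    logits = keys @ query                            # (num_scales,)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    ctx = np.tensordot(weights, scales, axes=1)      # (h, w, c)
    return ctx, weights

feat = rng.standard_normal((8, 8, 16))   # stand-in for a CNN feature map
query = rng.standard_normal(16)          # stand-in for a decoder hidden state
ctx, w = attend_scales(multi_scale_context(feat), query)
print(ctx.shape, w.shape)                # (8, 8, 16) (3,)
```

In the actual ACN, the attended contextual features would additionally be fused with the deformable-convolution spatial features before being passed to the multi-stage LSTM decoder; that fusion is omitted here for brevity.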

IITH Creators: Mohan, Chalavadi Krishna (ORCiD: 0000-0001-8342-0083)
Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: attention mechanism, contextual network, deformable network, Image captioning, multi-stage LSTM
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Mrs Haseena VKKM
Date Deposited: 16 Nov 2021 11:36
Last Modified: 18 Feb 2022 10:26
Publisher URL:
Related URLs:
