Image captioning considering imageability

Project: Image captioning considering imageability

In this research, we aim to generate image captions tailored to how they will actually be used. To achieve this, we incorporate psycholinguistic measures into caption generation.

For example, for visually impaired people to understand an image, captions that describe its content in as much detail as possible are preferred. For images in news articles, on the other hand, captions that reflect the content of the article are preferred over a plain description of the image content.

Image captions are used in many different real-world situations, and the desired properties differ by application. To generate captions suited to each of them, Ide Laboratory is working on image captioning that can freely adjust the level of detail with which the image content is described.

To this end, we consider “imageability”, a measure of how easily the content of a word can be imagined. By incorporating it into caption generation, a caption can be tailored to an intended degree of visualness. In the resulting model, if you input an image and specify a low imageability value, a concise caption is generated. In contrast, if you specify a high value, a caption that describes the image content in detail is generated.
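One simple way to picture this kind of control is to score candidate captions and pick the one closest to a requested value. The following is only a minimal sketch under stated assumptions, not the model described in the publications below: the dictionary TOY_IMAGEABILITY, its ratings, DEFAULT_SCORE, and both function names are hypothetical, and the sentence score is assumed to be a plain average of word scores.

```python
# Hypothetical word ratings on a 1-7 scale; the values are made up for illustration.
TOY_IMAGEABILITY = {
    "dog": 6.4, "grass": 6.0, "green": 5.7, "animal": 5.6,
    "brown": 5.5, "runs": 4.8, "outside": 4.5, "small": 4.2,
}
DEFAULT_SCORE = 3.0  # assumed rating for words missing from the toy dictionary


def sentence_imageability(caption):
    """Average the word ratings of a caption (an assumed, simplistic sentence score)."""
    words = [w.strip(".,") for w in caption.lower().split()]
    if not words:
        return 0.0
    return sum(TOY_IMAGEABILITY.get(w, DEFAULT_SCORE) for w in words) / len(words)


def pick_caption(candidates, target):
    """Return the candidate whose score is closest to the requested target value."""
    return min(candidates, key=lambda c: abs(sentence_imageability(c) - target))


candidates = [
    "An animal is outside.",
    "A small brown dog runs on green grass.",
]
low_choice = pick_caption(candidates, target=4.0)
high_choice = pick_caption(candidates, target=5.0)
```

With these made-up ratings, the lower target selects the shorter, more generic caption and the higher target selects the more detailed one. Re-ranking fixed candidates is used here only because it is easy to follow; it stands in for, and should not be read as, the way the actual captioning model conditions on the specified value.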

Main project members

Kazuki Umemura

Completed Master's degree in AY2020

Dr. Marc A. Kastner

Cooperative Research Fellow (Hiroshima City University)

Recent publications

Imageability- and length-controllable image captioning

Marc A. Kastner, Kazuki Umemura, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase, Shin'ichi Satoh
IEEE Access, vol.9, pp.162951-162961, November 2021.

Tell as you imagine: Sentence imageability-aware image captioning

Kazuki Umemura, Marc A. Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase
MultiMedia Modeling - 27th Int. Conf., MMM2021, Prague, Czech Republic, June 22-24, 2021, Procs., Part II, Jakub Lokoč, Tomáš Skopal, Klaus Schoeffmann, Vasileios Mezaris, Xirong Li, Stefanos Vrochidis, Ioannis Patras eds., Lecture Notes in Computer Science, vol. 12573, pp.62-73, Online, June 2021.

TBA (in Japanese)

Kazuki Umemura, Marc Aurel Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase
Meeting on Image Recognition and Understanding (MIRU) 2020, no.IS3-2-1, Online, August 2020.

A study on image captioning considering its imageability (in Japanese)

Kazuki Umemura, Marc Aurel Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase
IEICE Tech. Rep. Media Experience and Virtual Environment, MVE2019-69; MVE Award, Online, March 2020.

Estimating the imageability of a sentence for image caption evaluation

Kazuki Umemura, Marc Aurel Kastner, Ichiro Ide, Yasutomo Kawanishi, Daisuke Deguchi, Hiroshi Murase
Japan-Taiwan Joint Workshop on Multimedia and HCI, National Cheng-Kung University (Tainan, Taiwan), April 2019.

TBA (in Japanese)

Kazuki Umemura, Marc Aurel Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase
Proc. 25th ANLP Annual Meeting, no.A4-9, pp.755-758, Nagoya Univ., March 2019.
