Contents Science Lab

Nagoya University Graduate School of Informatics

Open Source: NUBird2022: Annotated Bird Image and Audio Dataset

Overview

This dataset is composed of multi-channel audio data and a panoramic image sequence recorded for approximately eight minutes in a closed 3D environment containing five Sunda zebra finches (Taeniopygia guttata), at a facility of Hokkaido University in Sapporo, Japan. A ground-truth bounding box surrounding each bird in the panoramic image sequence was manually annotated.

Contents

annotation

  • output.json: Annotation file in JSON format.
  • txt_anno: Converted text annotation files. Each file name corresponds to the frame number, and each line describes one bird as: x1, y1, x2, y2, difficult, occlusion, truncated (see the parsing sketch below).
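
The following is a minimal Python sketch for reading one txt_anno file. The field order follows the description above; the separator handling (comma or whitespace), the numeric types, and the example file name are assumptions.

from pathlib import Path

def load_boxes(anno_path):
    """Return a list of per-bird annotation dicts for one frame."""
    boxes = []
    for line in Path(anno_path).read_text().splitlines():
        # Assumed format: x1, y1, x2, y2, difficult, occlusion, truncated,
        # separated by commas and/or whitespace, one bird per line.
        fields = line.replace(",", " ").split()
        if len(fields) < 7:
            continue  # skip empty or malformed lines
        x1, y1, x2, y2, difficult, occlusion, truncated = fields[:7]
        boxes.append({
            "bbox": (float(x1), float(y1), float(x2), float(y2)),
            "difficult": int(difficult),
            "occlusion": int(occlusion),
            "truncated": int(truncated),
        })
    return boxes

# Example with a hypothetical frame-number file name:
# boxes = load_boxes("txt_anno/000123.txt")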

audio

  • gopro_sample.wav: Audio data recorded by the GoPro camera.
  • GS010046.wav: Multi-channel audio data recorded by the TAMAGO-03 microphone array (see the loading sketch below).
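
The following minimal Python sketch shows one way to inspect the multi-channel recording. It assumes the third-party soundfile package is installed; the channel count and sampling rate are read from the file rather than assumed.

import soundfile as sf

# Load the TAMAGO-03 recording; for a multi-channel file,
# data has shape (num_samples, num_channels).
data, sample_rate = sf.read("GS010046.wav")
num_channels = data.shape[1] if data.ndim > 1 else 1
duration = data.shape[0] / sample_rate
print(f"{num_channels} channels, {duration:.1f} s at {sample_rate} Hz")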

frame

  • Separated video frames. Each file name corresponds to the frame number (see the visualization sketch below).
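
Since the frame images and the text annotations share the frame number, they can be paired directly. The following sketch draws the ground-truth boxes on one frame; it assumes the Pillow package, reuses load_boxes from the sketch above, and the frame number and image extension shown are hypothetical.

from PIL import Image, ImageDraw

frame_id = "000123"                                  # hypothetical frame number
image = Image.open(f"frame/{frame_id}.jpg")          # actual extension may differ
draw = ImageDraw.Draw(image)
for box in load_boxes(f"txt_anno/{frame_id}.txt"):   # load_boxes from the sketch above
    draw.rectangle(box["bbox"], outline="red", width=3)
image.save(f"{frame_id}_annotated.jpg")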

video

  • GS010046.360: Raw video recorded by the GoPro camera.
  • GS010046.mp4: MP4 video converted using the GoPro tool.

Sample data

  • The following is a sample image from the frame images.

  • The following images show examples of birds annotated with the difficult, occlusion, and truncated tags, and a bird with no difficulty tag (normal), respectively.

Download

  • Since the dataset is large, please contact us if you wish to download it.
  • License: CC BY-SA 4.0

Change Log

  • Nov. 18, 2022: Web page opened

Citation

  • Please cite the following papers when publishing research that uses this dataset: Paper 1 when using only the video and/or audio, and Papers 1 and 2 when using the annotations.
  1. Shinji Sumitani, Reiji Suzuki, Takaya Arita, Kazuhiro Nakadai, and Hiroshi G. Okuno: “Non-invasive monitoring of the spatio-temporal dynamics of vocalizations among songbirds in a semi free-flight environment using robot audition techniques”, Birds, 2(2): 158-172, April 2021.
@article{Sumitani:Birds2021,
author = {Sumitani, Shinji and Suzuki, Reiji and Arita, Takaya and Nakadai, Kazuhiro and Okuno, Hiroshi G.},
title = {Non-invasive monitoring of the spatio-temporal dynamics of vocalizations among songbirds in a semi free-flight environment using robot audition techniques},
journal = {Birds},
volume = {2},
number = {2},
pages = {158--172},
month = {April},
year = {2021}
}
  2. Yasutomo Kawanishi, Ichiro Ide, Baidong Chu, Chihaya Matsuhira, Marc A. Kastner, Takahiro Komamizu, and Daisuke Deguchi: “Detection of birds in a 3D environment referring to audio-visual information”, Proc. 18th IEEE Int. Conf. on Advanced Video and Signal-based Surveillance (AVSS2022), To be published in Nov. 2022.
@inproceedings{Kawanishi:AVSS2022,
author = {Kawanishi, Yasutomo and Ide, Ichiro and Chu, Baidong and Matsuhira, Chihaya and Kastner, Marc A. and Komamizu, Takahiro and Deguchi, Daisuke},
title = {Detection of birds in a 3D environment referring to audio-visual information},
booktitle = {Proc. 18th IEEE Int. Conf. on Advanced Video and Signal-based Surveillance (AVSS2022)},
pages = {TBA},
month = {December},
year = {2022},
address = {Madrid, Spain / Online}
}

Acknowledgement

  • This dataset was prepared as part of JSPS/MEXT Grants-in-Aid for Scientific Research JP21K12058, JP20J13695, JP20H00475, JP19KK0260, JP18K11467, and JP17H06383 in #4903 (Evolinguistics).

  • The audio and video data were recorded in cooperation with Professor Kazuhiro WADA at the Graduate School of Science, Hokkaido University. Animal experiments were conducted under the guidelines and with the approval of the Committee on Animal Experiments of Hokkaido University. These guidelines are based on the national regulations for animal welfare in Japan (Law for the Humane Treatment and Management of Animals with partial amendment No. 105, 2011).

Contact

faculty+ [at] cs [dot] is [dot] i [dot] nagoya-u [dot] ac [dot] jp


Last updated: Apr. 10, 2024.