Towards Automatic Bird Detection: An Annotated and Segmented Acoustic Dataset of Seven Picidae Species
Published: 16 May 2017 by MDPI in Data
MDPI, Volume 2; doi:10.3390/data2020018
Abstract: Analysing the behavioural patterns of bird species in a given region enables researchers to recognize forthcoming changes in environment, ecology, and population. Ornithologists spend many hours observing and recording birds in their natural habitat to compare different audio samples and extract valuable insights. This manual process is typically undertaken by highly experienced birders who identify every species and its associated type of sound. In recent years, some public repositories hosting labelled acoustic samples from different bird species have emerged, yielding appealing datasets that computer scientists can use to test the accuracy of their machine learning algorithms and to assist ornithologists in the time-consuming process of analysing audio data. Current limitations in the performance of these algorithms stem from the fact that the acoustic samples in these datasets combine fragments containing only environmental noise with fragments containing the bird sound (i.e., the computer confuses environmental sound with the bird sound). Therefore, the purpose of this paper is to release a dataset lasting more than 4984 s that contains differentiated samples of (1) bird sounds and (2) environmental sounds. This data descriptor releases the processed audio samples, originally obtained from the Xeno-Canto repository, of the seven known species of the Picidae family inhabiting the Iberian Peninsula, which are good indicators of habitat quality and have significant value from an environmental conservation point of view.
Keywords: Birdsong, Acoustic Bird Recognition, Audio Sample, Dataset, Birdsong Dataset