J. Imaging, Vol. 11, Pages 281: Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations

Journal of Imaging doi: 10.3390/jimaging11080281

Authors:
Pafan Doungpaisan
Peerapol Khunarsa

Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds by analyzing twelve time–frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures—ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2—were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time–frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral–temporal patterns, providing a robust framework for accurate firearm sound classification.
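As a rough illustration of the pipeline the abstract describes, the sketch below computes one of the twelve representations (a Mel spectrogram), renders it as an RGB image with a perceptually informed colormap, and prepares a pre-trained ResNet18 (one of the six compared architectures) for fine-tuning on four firearm classes. It assumes librosa, matplotlib, and torchvision; the file name, colormap choice, and preprocessing values are illustrative assumptions, not settings taken from the paper, and the training loop is omitted.

```python
# Minimal sketch: Mel spectrogram -> RGB image -> pre-trained CNN.
# File name, colormap, and hyperparameters are assumptions for illustration only.
import numpy as np
import librosa
import matplotlib.pyplot as plt
import torch.nn as nn
from torchvision import models, transforms

def mel_spectrogram_rgb(wav_path, n_mels=128, cmap_name="magma"):
    """Compute a log-Mel spectrogram and render it as an (H, W, 3) uint8 RGB image."""
    y, sr = librosa.load(wav_path, sr=None)                   # keep native sample rate
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    S_db = librosa.power_to_db(S, ref=np.max)                 # log-scaled power
    S_norm = (S_db - S_db.min()) / (S_db.max() - S_db.min() + 1e-8)
    rgba = plt.get_cmap(cmap_name)(S_norm)                    # perceptual colormap -> RGBA
    return (rgba[..., :3] * 255).astype(np.uint8)             # drop alpha channel

# Fine-tune a pre-trained ResNet18 on the four firearm classes
# (the study's other architectures would be swapped in the same way).
num_classes = 4
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_classes)
model.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),                            # ImageNet input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

rgb = mel_spectrogram_rgb("gunshot.wav")                      # hypothetical recording
logits = model(preprocess(rgb).unsqueeze(0))                  # shape: (1, num_classes)
```

The same image-rendering step would apply to the other time-frequency representations (CQT, Cochleagram, STFT, and so on), each producing an RGB input for the CNNs compared in the study.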



Source link: www.mdpi.com