Musical Instrument Tone Recognition Using DCT Based Feature Extraction And Gaussian Windowing

Linggo Sumarno


This research studied a feature extraction method for a musical instrument tone recognition system. The aim was to obtain a smaller number of feature extraction coefficients than in previous studies. The feature extraction studied was DCT (Discrete Cosine Transform)-based segment averaging combined with Gaussian windowing. The tone recognition system was tested using pianica, tenor recorder, and bellyra tones, which represent signals with many, several, and one significant local peak in the transform domain, respectively. The test results showed that the optimal number of feature extraction coefficients was 8, which gave a recognition rate of up to 100%. These results were achieved using a Gaussian window with a sigma value of 2-6 and a 128-point DCT.
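The feature extraction chain named in the abstract (Gaussian windowing, a 128-point DCT, then segment averaging down to 8 coefficients) can be illustrated with a minimal sketch. Several details are assumptions not stated in the abstract: the function names are hypothetical, the window uses the convention w[n] = exp(-0.5 (sigma (n - c) / c)^2) with c = (N - 1) / 2 (so larger sigma narrows the window), the DCT is an unnormalized DCT-II, averaging is done on absolute DCT values, and the segments are equal-length and non-overlapping. The author's actual method may differ on any of these points.

```python
import math


def gaussian_window(n_points, sigma):
    """Gaussian window of length n_points.

    Assumption: sigma is a shape parameter; larger values give a narrower window.
    """
    half = (n_points - 1) / 2.0
    return [math.exp(-0.5 * (sigma * (n - half) / half) ** 2)
            for n in range(n_points)]


def dct_ii(x):
    """Unnormalized DCT-II computed directly (O(N^2), acceptable for 128 points)."""
    n_points = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def segment_average(values, n_segments):
    """Average equal-length, non-overlapping segments (an assumed segmentation)."""
    seg_len = len(values) // n_segments
    return [sum(values[i * seg_len:(i + 1) * seg_len]) / seg_len
            for i in range(n_segments)]


def extract_features(frame, sigma=4.0, n_coeffs=8):
    """Window the frame, take the DCT, and reduce it to n_coeffs averaged values."""
    window = gaussian_window(len(frame), sigma)
    windowed = [s * w for s, w in zip(frame, window)]
    magnitudes = [abs(c) for c in dct_ii(windowed)]
    return segment_average(magnitudes, n_coeffs)


# Synthetic 128-sample test tone (10 cycles per frame), not data from the study.
frame = [math.sin(2 * math.pi * 10 * n / 128) for n in range(128)]
features = extract_features(frame)
```

With 128 DCT values and 8 segments, each feature is the mean of 16 adjacent DCT magnitudes. For the synthetic tone above, the energy sits near DCT index 20, so the second feature is the largest. A classifier would then compare such 8-element vectors against stored references for each tone.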


Tone recognition; feature extraction; segment averaging; DCT; Gaussian window


