An overview for assessing a number of systems for estimating age and gender of speakers

Aalaa Ahmed Mohammed; Yusra Faisal Al-Irhayim

doi:10.25130/tjps.v26i1.106

Published: Dec 3, 2022

Keywords:

assessing a number, systems, age, gender, speakers

Abstract

Determining the age and gender of a speaker from the speech signal is an active topic in human-machine interaction. Speech signals have a variety of applications, ranging from speech analysis to human-machine interaction. This paper presents a comparative study of age and gender classification algorithms applied to the speech signal. It compares experimental results obtained on different voice sources, for speakers of different languages, using a range of classification methods: the Bayes classifier, neural networks, support vector machines, k-nearest neighbor, Gaussian mixture models, a hybrid method based on weighted supervised non-negative matrix factorization and a generalized regression neural network, and several deep learning methods. The comparison shows how the results of age and gender classification differ across these methods. The study found that deep learning methods and algorithms achieved higher accuracy than the other methods, and that hybridizing two or more classification methods increases the accuracy of the results.

How to Cite

Aalaa Ahmed Mohammed, & Yusra Faisal Al-Irhayim. (2022). An overview for assessing a number of systems for estimating age and gender of speakers. Tikrit Journal of Pure Science, 26(1), 101–107. https://doi.org/10.25130/tjps.v26i1.106

Issue

Vol. 26 No. 1 (2021)

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.

Tikrit Journal of Pure Science is licensed under the Creative Commons Attribution 4.0 International License, which allows users to copy, create extracts, abstracts, and new works from the article, alter and revise the article, and make commercial use of the article (including reuse and/or resale of the article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made, and does not represent the licensor as endorsing the use made of the work. The authors hold the copyright for their published work on the Tikrit J. Pure Sci. website, while Tikrit J. Pure Sci. is responsible for appropriate citation of their work, which is released under CC-BY-4.0, enabling the unrestricted use, distribution, and reproduction of an article in any medium, provided that the original work is properly cited.
