A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations

Published online: Jun 4, 2020
Full Text: PDF (921 KiB)
DOI: 10.24138/jcomss.v16i2.1032
Samuel Terra Vieira, Renata Lopes Rosa, Demóstenes Zegarra Rodríguez


Many factors can affect the users’ quality of experience (QoE) in speech communication services. The impairment factors appear due to physical phenomena that occur in the transmission channel of wireless and wired networks. The monitoring of users’ QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNNs models, and its main advantage is to decrease the training time that is very relevant on speech quality assessment methods. In the training phase of the proposed classifier model, impaired speech signals caused by wired and wireless network degradation are used as input. Also, in the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequencies are implemented. Experimental results demonstrated that the proposed model achieves significant reduction of training time, reaching 25% of reduction in relation to another implementation based on DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed classifier based on the Tree-CNN overcomes both the current standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method called ViSQOL.


speech quality, objective metrics, Wireless Network, wired network, Deep learning, Tree-CNN
Creative Commons License 4.0
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.