A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations

Terra Vieira, Samuel; Lopes Rosa, Renata; Zegarra Rodríguez, Demóstenes

doi:10.24138/jcomss.v16i2.1032

Published online: Jun 4, 2020

Abstract

Many factors can affect users’ quality of experience (QoE) in speech communication services. These impairment factors arise from physical phenomena in the transmission channels of wireless and wired networks. Monitoring users’ QoE is therefore important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models, and its main advantage is a reduced training time, which is highly relevant for speech quality assessment methods. In the training phase of the proposed classifier, speech signals impaired by wired and wireless network degradations are used as input. In the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency, are implemented. Experimental results demonstrate that the proposed model achieves a significant reduction in training time, reaching a 25% reduction relative to another implementation based on DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed Tree-CNN classifier outperforms both the current standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method ViSQOL.
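
As a rough illustration of the hierarchical idea described in the abstract, the sketch below routes an input from a root classifier down to a leaf that assigns the final class. It is not the authors' architecture: the `TreeNode` and `predict` names, the two-level layout, the four class labels, and the single scalar feature are all assumptions made for the example, and plain functions stand in for trained CNNs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence

# A classifier maps a feature vector to a label. In a Tree-CNN each node
# would be a CNN; here plain functions stand in for trained models
# (an assumption made only for this illustration).
Classifier = Callable[[Sequence[float]], str]


@dataclass
class TreeNode:
    """One node of a hypothetical classifier tree."""
    classify: Classifier
    # Maps a label predicted at this node to the child that refines it.
    # Labels with no child entry are treated as final classes.
    children: Dict[str, "TreeNode"] = field(default_factory=dict)


def predict(node: TreeNode, features: Sequence[float]) -> str:
    """Route the input from the root down to a leaf and return its label."""
    label = node.classify(features)
    while label in node.children:
        node = node.children[label]
        label = node.classify(features)
    return label


# Toy stand-ins: the single feature is an assumed quality score in [0, 1],
# and the class labels are hypothetical.
root = TreeNode(classify=lambda x: "low" if x[0] < 0.5 else "high")
root.children["low"] = TreeNode(
    classify=lambda x: "bad" if x[0] < 0.25 else "poor"
)
root.children["high"] = TreeNode(
    classify=lambda x: "good" if x[0] < 0.75 else "excellent"
)
```

In this sketch only the nodes along one root-to-leaf path are evaluated for a given input, and each node solves a smaller sub-problem than a single flat classifier over all classes would.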

Keywords

speech quality, objective metrics, wireless network, wired network, deep learning, Tree-CNN

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.