Spectral Proximal Method and Saliency Matrix for Robust Deep Learning Optimization
Abstract
This paper presents a new training optimizer for deep learning models, the Spectral Proximal (SP) method with saliency matrix, which aims to improve their ability to generalize to new data. Generalization measures how well a model performs on data that it has not seen during training. The SP method addresses two obstacles to generalization: gradient confusion within complex model architectures and the limited availability of training data. Its key innovation is a proximal operator with a saliency matrix, which adjusts the descent direction according to the importance of each parameter and helps avoid overfitting. This leads to improved performance on image classification (MNIST and CIFAR-10) and object detection (YOLOv7) tasks, and to better generalization to new data. We ran experiments on various configurations while controlling for potential confounding factors. In these experiments, the SP method consistently outperformed the baseline method.
Keywords
spectral proximal method, saliency matrix, deep learning, machine vision, optimization algorithm
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
C. Yong, B. Kwan, D. Ng and H. Sim, "Spectral Proximal Method and Saliency Matrix for Robust Deep Learning Optimization," in Journal of Communications Software and Systems, vol. 20, no. 1, pp. 113-124, February 2024, doi: https://doi.org/10.24138/jcomss-2023-0124
@article{yong2024spectralproximal, author = {Cherng-Liin Yong and Ban-Hoe Kwan and Danny-Wee-Kiat Ng and Hong-Seng Sim}, title = {Spectral Proximal Method and Saliency Matrix for Robust Deep Learning Optimization}, journal = {Journal of Communications Software and Systems}, month = {2}, year = {2024}, volume = {20}, number = {1}, pages = {113--124}, doi = {10.24138/jcomss-2023-0124}, url = {https://doi.org/10.24138/jcomss-2023-0124} }