Spectral Proximal Method and Saliency Matrix for Robust Deep Learning Optimization

Published online: Feb 26, 2024
DOI: https://doi.org/10.24138/jcomss-2023-0124
Authors:
Cherng-Liin Yong, Ban-Hoe Kwan, Danny-Wee-Kiat Ng, Hong-Seng Sim

Abstract

This paper presents a new training optimizer for deep learning models, the Spectral Proximal (SP) method with a saliency matrix, designed to improve generalization, that is, how well a model performs on data it has not seen during training. The SP method addresses two obstacles to generalization: gradient confusion in complex model architectures and the limited availability of training data. Its key innovation is a proximal operator with a saliency matrix, which adjusts the descent direction according to the importance of each parameter and mitigates overfitting. This improves performance on image classification (MNIST and CIFAR-10) and object detection (YOLOv7) tasks. We ran experiments on a range of configurations while controlling for potential confounding factors, and the SP method consistently outperformed the baseline method.
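
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "spectral" refers to a Barzilai-Borwein step size, that the proximal operator is a weighted soft-threshold, and that saliency is measured as |parameter × gradient|. All function names and the saliency definition are hypothetical choices made for illustration.

```python
import numpy as np

def weighted_soft_threshold(v, thresholds):
    """Proximal operator of sum_i thresholds[i] * |x_i|, evaluated at v."""
    return np.sign(v) * np.maximum(np.abs(v) - thresholds, 0.0)

def bb_step_size(x, x_prev, g, g_prev, fallback=1e-2):
    """Barzilai-Borwein (spectral) step size s^T s / s^T y, with a fallback."""
    s = x - x_prev          # change in parameters
    y = g - g_prev          # change in gradients
    sty = float(s @ y)
    if sty <= 1e-12:        # curvature estimate unusable: use the fallback
        return fallback
    return float(s @ s) / sty

def saliency_weights(x, g, eps=1e-12):
    """Hypothetical saliency score |x_i * g_i|, scaled into [0, 1]."""
    s = np.abs(x * g)
    return s / (s.max() + eps)

def proximal_step(x, g, step, lam):
    """One proximal gradient step in which salient parameters are shrunk less."""
    sal = saliency_weights(x, g)
    thresholds = step * lam * (1.0 - sal)   # assumption: low saliency, more shrinkage
    return weighted_soft_threshold(x - step * g, thresholds)
```

In this sketch the saliency scores act as a diagonal matrix that rescales the shrinkage applied to each parameter, so parameters judged unimportant are pushed toward zero while important ones follow the plain gradient step. The paper's actual saliency matrix and proximal term may be defined differently.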

Keywords

spectral proximal method, saliency matrix, deep learning, machine vision, optimization algorithm
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.