Deep Model Compression: Distilling Knowledge from Noisy Teachers

Sau, B B and Balasubramanian, Vineeth N (2016) Deep Model Compression: Distilling Knowledge from Noisy Teachers. arXiv, v2. pp. 1-9.

1610.09650.pdf - Accepted Version

The remarkable successes of deep learning models across various applications have resulted in the design of deeper networks that can solve complex problems. However, the increasing depth of such models also results in a higher storage and runtime complexity, which restricts the deployability of such very deep models on mobile and portable devices, which have limited storage and battery capacity. While many methods have been proposed for deep model compression in recent years, almost all of them have focused on reducing storage complexity. In this work, we extend the teacher-student framework for deep model compression, since it has the potential to address runtime and train time complexity too. We propose a simple methodology to include a noise-based regularizer while training the student from the teacher, which provides a healthy improvement in the performance of the student network. Our experiments on the CIFAR-10, SVHN and MNIST datasets show promising improvement, with the best performance on the CIFAR-10 dataset. We also conduct a comprehensive empirical evaluation of the proposed method under related settings on the CIFAR-10 dataset to show the promise of the proposed approach.
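
As a rough illustration of what a noise-based regularizer in teacher-student training could look like, the sketch below perturbs a teacher's output logits before the student regresses onto them. The abstract does not specify the noise model or the loss, so everything here is an assumption made for illustration: additive Gaussian noise, per-sample selection with probability `select_prob`, a squared-error loss, and the helper names `perturb_teacher_logits` and `student_loss` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def perturb_teacher_logits(teacher_logits, sigma=0.1, select_prob=0.5, rng=None):
    """Return a noisy copy of the teacher logits (hypothetical helper).

    Assumption: each sample (row) is selected independently with probability
    `select_prob`, and zero-mean Gaussian noise with standard deviation
    `sigma` is added to the logits of the selected samples only.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(teacher_logits, dtype=float)
    # Boolean mask over samples: True where noise should be applied.
    selected = rng.random(logits.shape[0]) < select_prob
    noise = rng.normal(0.0, sigma, size=logits.shape)
    # Broadcasting the mask zeroes out the noise for unselected samples.
    return logits + noise * selected[:, None]


def student_loss(student_logits, target_logits):
    """Half the mean squared L2 distance between student and target logits.

    Assumption: the student is trained by regressing onto the (noisy)
    teacher logits; the abstract does not name the loss.
    """
    diff = np.asarray(student_logits, dtype=float) - np.asarray(target_logits, dtype=float)
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1))
```

In a training loop under these assumptions, each mini-batch of teacher logits would pass through `perturb_teacher_logits` before being used as the target in `student_loss`, so the student sees slightly different targets for the same inputs across epochs.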

IITH Creators: Balasubramanian, Vineeth N (ORCiD: UNSPECIFIED)
Item Type: Article
Uncontrolled Keywords: Deep Learning, Model Compression, Teacher-Student Learning, Regularization, Noise
Subjects: Computer science > Big Data Analytics
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 17 Nov 2016 09:06
Last Modified: 25 Apr 2018 05:44