Knowledge Distillation from Multiple Teachers using Visual Explanations

Mehak, Mehak and Balasubramanian, Vineeth N (2018) Knowledge Distillation from Multiple Teachers using Visual Explanations. Masters thesis, Indian Institute of Technology Hyderabad.

Thesis_Mtech_CS_4087.pdf - Submitted Version



Deep neural networks have exhibited state-of-the-art performance in many computer vision tasks. However, most top-performing convolutional neural networks (CNNs) are either very wide or very deep, which makes them memory- and computation-intensive. The main motivation of this work is to facilitate the deployment of CNNs on portable devices with low storage and computation power, which can be achieved through model compression. We propose a novel method of knowledge distillation, a technique for model compression in which a shallow network is trained from the softened outputs of a deep teacher network. In this work, knowledge is distilled from multiple deep teacher neural networks to train a shallow student neural network, based on the visualizations produced by the last convolutional layer of the teacher networks. The shallow student network learns from the teacher network with the best visual explanations: the student is made to mimic both the teacher's logits and the localization maps generated by Grad-CAM (Gradient-weighted Class Activation Mapping). Grad-CAM uses the gradients flowing into the last convolutional layer to generate localization maps that explain the decisions made by the CNN; the regions important to a specific class prediction are highlighted in the map. Training the student with the teacher network's visualizations improves the student's performance because the student mimics the important portions of the image learned by the teacher. Experiments are performed on CIFAR-10, CIFAR-100, and the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) dataset for the task of image classification.
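The two ingredients described above — a Grad-CAM-style localization map from the last convolutional layer, and a training objective that mixes softened-logit matching with map matching — can be sketched in NumPy as follows. This is a minimal illustrative sketch, not the thesis's exact formulation: the function names, the temperature `T`, the weighting `alpha`, and the simple weighted-sum loss are assumptions made for illustration.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM-style localization map.

    activations, gradients: arrays of shape (K, H, W) from the
    last convolutional layer (K feature maps).
    """
    # alpha_k: global-average-pool the gradients over spatial dims
    weights = gradients.mean(axis=(1, 2))
    # weighted sum of the feature maps over channels
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only regions with a positive influence on the class
    cam = np.maximum(cam, 0)
    if cam.max() > 0:
        cam = cam / cam.max()          # normalize to [0, 1]
    return cam

def softmax(z, T=1.0):
    """Temperature-softened softmax (numerically stable)."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits,
                      student_map, teacher_map, T=4.0, alpha=0.5):
    """Hinton-style KD term plus a localization-map matching term.

    alpha and T are illustrative hyperparameters, not values from
    the thesis.
    """
    # KL divergence between softened teacher and student distributions
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kd = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)))
    # L2 distance between the Grad-CAM localization maps
    map_loss = np.mean((student_map - teacher_map) ** 2)
    return alpha * kd + (1 - alpha) * map_loss
```

With multiple teachers, one would compute a map per teacher and distill from the teacher whose explanation is judged best; the selection criterion (and the real training loop, which backpropagates through the student) is beyond this sketch.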

IITH Creators: Balasubramanian, Vineeth N (ORCiD: UNSPECIFIED)
Item Type: Thesis (Masters)
Uncontrolled Keywords: Model Compression, Grad-CAM, Deep Neural Network, Localization Map, Knowledge Distillation
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 27 Jun 2018 11:19
Last Modified: 02 Jun 2022 10:05
Publisher URL:
Related URLs:
