Knowledge Distillation from Multiple Teachers using Visual Explanations
Mehak, Mehak and Balasubramanian, Vineeth N (2018) Knowledge Distillation from Multiple Teachers using Visual Explanations. Masters thesis, Indian Institute of Technology Hyderabad.
Text: Thesis_Mtech_CS_4087.pdf - Submitted Version (3MB). Restricted to Repository staff only until July 2020.
Abstract
Deep neural networks have exhibited state-of-the-art performance in many computer vision tasks. However, most of the top-performing convolutional neural networks (CNNs) are either very wide or very deep, which makes them memory- and computation-intensive. The main motivation of this work is to facilitate the deployment of CNNs on portable devices with low storage and computation power, which can be achieved through model compression. We propose a novel method of knowledge distillation, which is a technique for model compression. In knowledge distillation, a shallow network is trained from the softened outputs of a deep teacher network. In this work, knowledge is distilled from multiple deep teacher neural networks to train a shallow student neural network based on the visualizations produced by the last convolutional layer of the teacher networks. The shallow student network learns from the teacher network with the best visual explanations. The student is made to mimic the teacher's logits as well as the localization maps generated by Grad-CAM (Gradient-weighted Class Activation Mapping). Grad-CAM uses the gradients flowing into the last convolutional layer to generate localization maps that explain the decisions made by the CNN. The important regions are highlighted in the localization map, which explains the specific class predictions made by the network. Training the student with visualizations of the teacher network helps improve the performance of the student network, because the student mimics the important portions of the image learned by the teacher. The experiments are performed on CIFAR-10, CIFAR-100, and the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) dataset for the task of image classification.
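The training objective described above can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation: the Grad-CAM map is computed from precomputed last-conv activations and gradients, and the "best teacher" selection heuristic (highest mean CAM activation) is a stand-in assumption, since the abstract does not specify the exact criterion. The temperature `T` and weight `lam` are likewise illustrative hyperparameters.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax, as used for distillation targets."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_cam_map(activations, gradients):
    """Grad-CAM localization map from last-conv activations/gradients.

    activations, gradients: arrays of shape (K, H, W), one slice per channel.
    Channel weights are the global-average-pooled gradients; the map is the
    ReLU of the weighted sum of activation maps, normalized to [0, 1].
    """
    alphas = gradients.mean(axis=(1, 2))                      # (K,) weights
    cam = np.maximum((alphas[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()
    return cam

def distillation_loss(student_logits, teacher_logits_list,
                      student_cam, teacher_cams, T=4.0, lam=0.5):
    """Combined loss: KL to the best teacher's softened logits plus an
    MSE term matching the student's Grad-CAM map to that teacher's map.

    ASSUMPTION: the teacher with the "best visual explanation" is chosen
    here by highest mean CAM activation -- an illustrative heuristic only.
    """
    best = int(np.argmax([c.mean() for c in teacher_cams]))
    p_t = softmax(teacher_logits_list[best], T)
    p_s = softmax(student_logits, T)
    kl = float(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))))
    map_mse = float(np.mean((student_cam - teacher_cams[best]) ** 2))
    return kl + lam * map_mse, best
```

When the student's logits and localization map match the selected teacher's exactly, both terms vanish and the loss is zero, which is the behavior a distillation objective of this form should have.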
IITH Creators: |
---|---
Item Type: | Thesis (Masters)
Uncontrolled Keywords: | Model Compression, Grad-CAM, Deep Neural Network, Localization Map, Knowledge Distillation
Subjects: | Computer science
Divisions: | Department of Computer Science & Engineering
Depositing User: | Team Library
Date Deposited: | 27 Jun 2018 11:19
Last Modified: | 27 Jun 2018 11:19
URI: | http://raiith.iith.ac.in/id/eprint/4087
Publisher URL: |
Related URLs: |