Learning Representations for Image and Video Understanding

Das, Bedanta Kumar and Mohan, C Krishna (2019) Learning Representations for Image and Video Understanding. Masters thesis, Indian Institute of Technology Hyderabad.

Mtech_Thesis_TD1467_2019.pdf
Restricted to Repository staff only until 8 December 2019.


Abstract

Data representation is at the core of all machine learning algorithms, and their performance depends largely on the features or representations of the input to which they are applied. Hence, to deploy a machine learning model, a considerable amount of time is invested in designing data preprocessing pipelines and data transformations that yield an efficient representation of the data. Such feature engineering is costly yet essential, and it accentuates a key shortcoming of machine learning algorithms: their inability to extract abstract information from the input data. Feature engineering is a way to leverage human ingenuity and prior knowledge to compensate for this shortcoming. Hence, to make machine learning models easily deployable and application ready, it is highly desirable to reduce the dependence of learning algorithms on engineered features so that novel algorithms can be constructed much faster. This thesis proposes a novel approach to representing a video as a graph for action recognition and localization using only class label information. In addition, it proposes a novel subspace attention mechanism that learns to capture long-range inter-dependencies in visual data. This attention mechanism is implemented as a block that can be incorporated into any backbone convolutional neural network.

IITH Creators: Mohan, C Krishna (ORCiD: UNSPECIFIED)
Item Type: Thesis (Masters)
Uncontrolled Keywords: Machine Learning, Computer vision, Representation learning, Deep learning
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 09 Jul 2019 06:47
Last Modified: 09 Jul 2019 06:47
URI: http://raiith.iith.ac.in/id/eprint/5671
