Distributed Inference With Straggler Mitigation

Krishna, Lolla Sai and Natarajan, Lakshmi Prasad (2019) Distributed Inference With Straggler Mitigation. Masters thesis, Indian Institute of Technology Hyderabad.




Machine learning now has major applications in a wide variety of tasks, such as image classification, object detection, and natural language processing. Trained models are deployed in cloud-based prediction serving systems, which take input requests from users and return predictions by performing inference on a trained model. These services use a distributed architecture in which many interconnected nodes serve user requests. The nodes are subject to several kinds of unavailability, such as temporary slowdowns and failures. Nodes facing temporary unavailability are known as stragglers, and they delay the entire computation. The objective of this thesis is to design a framework for inference in a distributed setup that is robust to stragglers. The distributed setup is trained in such a way that it classifies the image with good accuracy even in the presence of straggling nodes during inference. The setup consists of many neural networks in parallel, along with a master neural network, also known as the decoder, which collects the prediction vectors from all nodes. The image to be classified is partitioned into as many parts as there are nodes, and the parts are given as input to the nodes. The final predictions are taken from the decoder. Two neural network architectures are considered when implementing the distributed setup: a base-MLP model and a CNN model. The setup is trained for various possible straggler scenarios. During the training phase, the decoder and the nodes are learned by backpropagating the error and updating the weights through gradient descent. During inference, the trained setup is tested by varying the number of stragglers in the test set.
The distributed setup classifies images with good accuracy during inference even when stragglers are present, because it is trained on different possible straggler scenarios during the training phase.
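
The data flow described in the abstract (partition the image, run one network per node, pass the collected prediction vectors to a decoder) can be illustrated with a minimal forward-pass sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the image is split into horizontal strips, node i receives strip i, a straggler's output is replaced by a zero vector, and untrained linear maps stand in for the MLP/CNN nodes and the decoder. Training by backpropagation is omitted, and all function names are hypothetical.

```python
import itertools
import random


def partition_image(image, num_nodes):
    """Split a 2-D image (list of rows) into num_nodes horizontal strips.

    Assumption: the abstract does not say how the image is partitioned;
    row-wise strips are used here only for illustration.
    """
    base, extra = divmod(len(image), num_nodes)
    parts, start = [], 0
    for i in range(num_nodes):
        end = start + base + (1 if i < extra else 0)
        parts.append(image[start:end])
        start = end
    return parts


def node_forward(part, weights):
    """Toy linear node: flatten the strip and map it to a prediction vector."""
    x = [pixel for row in part for pixel in row]
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in weights]


def mask_stragglers(outputs, stragglers):
    """Zero out the prediction vectors of straggling nodes (an assumption)."""
    return [
        [0.0] * len(out) if i in stragglers else list(out)
        for i, out in enumerate(outputs)
    ]


def decoder_forward(outputs, weights):
    """Toy linear decoder acting on the concatenated node outputs."""
    x = [v for out in outputs for v in out]
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in weights]


def straggler_scenarios(num_nodes, max_stragglers):
    """Yield every set of straggling nodes with at most max_stragglers members."""
    for k in range(max_stragglers + 1):
        for combo in itertools.combinations(range(num_nodes), k):
            yield set(combo)


# Usage: 4 nodes, 3 classes, a random 8x8 "image", untrained random weights.
rng = random.Random(0)
num_nodes, num_classes = 4, 3
image = [[rng.random() for _ in range(8)] for _ in range(8)]
parts = partition_image(image, num_nodes)
node_weights = [
    [[rng.uniform(-1, 1) for _ in range(len(p) * len(p[0]))]
     for _ in range(num_classes)]
    for p in parts
]
decoder_weights = [
    [rng.uniform(-1, 1) for _ in range(num_nodes * num_classes)]
    for _ in range(num_classes)
]
outputs = [node_forward(p, w) for p, w in zip(parts, node_weights)]

predictions = {}
for stragglers in straggler_scenarios(num_nodes, max_stragglers=1):
    scores = decoder_forward(mask_stragglers(outputs, stragglers), decoder_weights)
    predictions[frozenset(stragglers)] = max(range(num_classes), key=scores.__getitem__)
```

In a trained version of such a setup, the same enumeration of straggler sets could be sampled during training so that the decoder sees masked inputs; how the thesis actually selects scenarios is not stated in the abstract.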

IITH Creators: Natarajan, Lakshmi Prasad (ORCiD: http://orcid.org/0000-0003-1552-5240)
Item Type: Thesis (Masters)
Uncontrolled Keywords: Distributed setup, Stragglers, Straggler mitigation
Subjects: Electrical Engineering
Divisions: Department of Electrical Engineering
Depositing User: Team Library
Date Deposited: 08 Jul 2019 11:02
Last Modified: 08 Jul 2019 11:02
URI: http://raiith.iith.ac.in/id/eprint/5663
