Phoneme Based Embedded Segmental K-Means for Unsupervised Term Discovery

Bhati, Saurabhch and Kamper, Herman and Kodukula, Sri Rama Murty (2018) Phoneme Based Embedded Segmental K-Means for Unsupervised Term Discovery. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 15-20 April 2018, Calgary, Canada.

Identifying and grouping frequently occurring word-like patterns in raw acoustic waveforms is an important task in zero-resource speech processing. Embedded segmental K-means (ES-KMeans) discovers both word boundaries and word types from raw data. Starting from an initial set of subword boundaries, ES-KMeans iteratively eliminates some of the boundaries to arrive at frequently occurring, longer word patterns. The initial boundaries themselves are never adjusted during this process; they can only be kept or removed. As a result, the performance of ES-KMeans depends critically on the initial subword boundaries. Originally, syllable boundaries were used to initialize ES-KMeans. In this paper, we propose to initialize ES-KMeans with a phoneme segmentation method whose boundaries lie closer to the true boundaries. Using shorter units increases the number of initial boundaries, which significantly increases the computational complexity. To reduce the computational cost, we extract compact, lower-dimensional embeddings from an auto-encoder. The proposed algorithm is benchmarked on the Zero Resource 2017 challenge, which consists of 70 hours of unlabeled data across three languages: English, French, and Mandarin. The proposed algorithm outperforms the baseline system without any language-specific parameter tuning.
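
The boundary-elimination idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal illustration under stated assumptions. The function names (`embed`, `segment`, `es_kmeans`), the `max_span` limit, the duration-weighted cost, and the duration-weighted centroid update are all choices made for this sketch. In particular, `embed` uses simple chunk averaging as a stand-in for the auto-encoder embeddings mentioned above, and the candidate boundaries are assumed to come from some external phoneme segmenter.

```python
import numpy as np

def embed(frames, n_chunks=3):
    """Fixed-size segment embedding by averaging frames in n_chunks parts.
    Assumption: a stand-in for the auto-encoder embeddings in the abstract."""
    idx = np.linspace(0, len(frames), n_chunks + 1).astype(int)
    parts = []
    for a, b in zip(idx[:-1], idx[1:]):
        b = max(b, a + 1)  # guarantee a non-empty chunk for short segments
        parts.append(frames[a:b].mean(axis=0))
    return np.concatenate(parts)

def segment(features, bounds, centroids, max_span=4):
    """Dynamic programming over the candidate boundaries only.
    bounds: sorted candidate end frames; the last one equals len(features).
    A segment may merge at most max_span consecutive candidate units."""
    edges = [0] + list(bounds)
    n = len(bounds)
    best = np.full(n + 1, np.inf)
    best[0] = 0.0
    back = np.zeros(n + 1, dtype=int)
    label = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_span), j):
            seg = features[edges[i]:edges[j]]
            dist = ((centroids - embed(seg)) ** 2).sum(axis=1)
            k = int(dist.argmin())
            cost = best[i] + len(seg) * dist[k]  # duration-weighted cost
            if cost < best[j]:
                best[j], back[j], label[j] = cost, i, k
    kept, labels = [], []
    j = n
    while j > 0:  # backtrack: boundaries not visited here are eliminated
        kept.append(edges[j])
        labels.append(int(label[j]))
        j = back[j]
    return kept[::-1], labels[::-1]

def es_kmeans(features, bounds, n_clusters=2, n_iter=5, seed=0):
    """Alternate between segmentation and centroid updates."""
    rng = np.random.default_rng(seed)
    edges = [0] + list(bounds)
    init = np.stack([embed(features[a:b]) for a, b in zip(edges[:-1], edges[1:])])
    centroids = init[rng.choice(len(init), n_clusters, replace=False)]
    for _ in range(n_iter):
        kept, labels = segment(features, bounds, centroids)
        cur = [0] + kept
        spans = list(zip(cur[:-1], cur[1:]))
        embs = np.stack([embed(features[a:b]) for a, b in spans])
        durs = np.array([b - a for a, b in spans])
        for k in range(n_clusters):
            mask = np.array(labels) == k
            if mask.any():  # leave empty clusters unchanged
                centroids[k] = np.average(embs[mask], axis=0, weights=durs[mask])
    return kept, labels

# Toy usage: random features and evenly spaced hypothetical candidates.
feats = np.random.default_rng(0).normal(size=(40, 4))
cands = [5, 10, 15, 20, 25, 30, 35, 40]
kept, labels = es_kmeans(feats, cands)
```

Because the dynamic program only ever chooses among the supplied candidates, every boundary in `kept` is one of the initial boundaries, which is why the quality of the initial segmentation bounds the quality of the result. The inner loop also shows where the cost grows: more candidate boundaries mean more segment hypotheses to embed and score.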

IITH Creators: Kodukula, Sri Rama Murty
Item Type: Conference or Workshop Item (Paper)
Additional Information: ISSN: 15206149 ISBN: 978-153864658-8
Uncontrolled Keywords: Spoken term discovery, Unsupervised learning, Word segmentation, Zero Resource speech processing
Subjects: Materials Engineering > Testing and measurement
Materials Engineering > Materials engineering
Materials Engineering > Nanostructured materials, porous materials
Materials Engineering > Organic materials
Materials Engineering > Composite materials
Divisions: Department of Electrical Engineering
Depositing User: . LibTrainee 2021
Date Deposited: 24 May 2021 07:15
Last Modified: 24 May 2021 07:15
