Embedded Segmental K-Means for ZeroSpeech2017 Track 2

Warning

This is a preliminary version of our system. This is not a final recipe, and is still being worked on.

Overview

A description of the challenge can be found here: http://sapience.dec.ens.fr/bootphon/2017/index.html.

Disclaimer

The code provided here is not pretty. But I believe that research should be reproducible, and I hope that this repository is sufficient to make this possible for the paper mentioned above. I provide no guarantees with the code, but please let me know if you have any problems, find bugs or have general comments.

Preliminaries

Clone the zerospeech repositories:

mkdir ../src/
git clone https://github.com/bootphon/zerospeech2017.git \
    ../src/zerospeech2017/
# To-do: add installation and data download instructions
git clone https://github.com/bootphon/zerospeech2017_surprise.git \
    ../src/zerospeech2017_surprise/

Clone the eskmeans repository:

git clone https://github.com/kamperh/eskmeans.git \
    ../src/eskmeans/

Get the surprise data:

cd ../src/zerospeech2017_surprise/
source download_surprise_data.sh \
    /share/data/lang/users/kamperh/zerospeech2017/data/surprise/
cd -

Update all the paths in paths.py to match your directory structure.

Feature extraction

Extract MFCC features by running the steps in features/readme.md.

Unsupervised syllable boundary detection

We use the unsupervised syllable boundary detection algorithm described in:

O. J. Räsänen, G. Doyle, and M. C. Frank, "Unsupervised word discovery from speech using automatic segmentation into syllable-like units," in Proc. Interspeech, 2015.

Obtain the syllabe boundaries by running the steps in syllables/readme.md.

Acoustic word embeddings: downsampling

We use one of the simplest methods to obtain acoustic word embeddings: downsampling. Different types of input features can be used. Run the steps in downsample/readme.md.

Unsupervised segmentation and clustering

Segmentation and clustering is performed using the ESKMeans package. Run the steps in segmentation/readme.md.

Dependencies

Python
NumPy and SciPy.
HTK: Used for MFCC feature extraction.
Matlab: Used for syllable boundary detection.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Embedded Segmental K-Means for ZeroSpeech2017 Track 2

Warning

Overview

Disclaimer

Preliminaries

Feature extraction

Unsupervised syllable boundary detection

Acoustic word embeddings: downsampling

Unsupervised segmentation and clustering

Dependencies

Collaborators

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
downsample		downsample
eval/abx		eval/abx
features		features
segmentation		segmentation
submission		submission
syllables		syllables
.gitignore		.gitignore
paths.py		paths.py
readme.md		readme.md

kamperh/recipe_zs2017_track2

Folders and files

Latest commit

History

Repository files navigation

Embedded Segmental K-Means for ZeroSpeech2017 Track 2

Warning

Overview

Disclaimer

Preliminaries

Feature extraction

Unsupervised syllable boundary detection

Acoustic word embeddings: downsampling

Unsupervised segmentation and clustering

Dependencies

Collaborators

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages