timit-preprocessor extracts MFCC vectors and phones from the TIMIT dataset for further use in speech recognition.
The TIMIT corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of automatic speech recognition systems. More information is available on the website or Wiki.
Note: install Kaldi first by following the instructions in its INSTALL file:
(1) go to tools/ and follow the INSTALL instructions there.
(2) go to src/ and follow the INSTALL instructions there.
After running the scripts that INSTALL in tools/ tells you to run, the following reminder appears. Run the script it names.
Kaldi Warning: IRSTLM is not installed by default anymore. If you need IRSTLM, use the script extras/install_irstlm.sh
Once Kaldi is installed, start by cloning this repo:
git clone https://github.com/orbxball/timit-preprocessor.git
- Run ./convert_wav.sh only the first time after cloning this repo.
- Run python3 parsing.py -h to see how to parse the TIMIT dataset into phone labels and raw intermediate files in the folder data/material/.
- Run ./extract_mfcc.sh to extract MFCC vectors into .scp and .ark files.
Finally, the data/ folder contains all the outputs in the following directory structure:
data/
|-- material
|   |-- test.lbl
|   `-- train.lbl
`-- processed
    |-- test.39.cmvn.ark
    |-- test.39.cmvn.scp
    |-- test.extract.log
    |-- train.39.cmvn.ark
    |-- train.39.cmvn.scp
    `-- train.extract.log
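To confirm that all three steps finished, it can help to check that every file in the tree above exists. The following is a minimal sketch, not part of this repo: the function name `missing_outputs` and the default `data` path are assumptions made for illustration, and the file list is copied from the tree.

```python
from pathlib import Path

# Expected outputs, copied from the directory tree above.
EXPECTED_FILES = [
    "material/test.lbl",
    "material/train.lbl",
    "processed/test.39.cmvn.ark",
    "processed/test.39.cmvn.scp",
    "processed/test.extract.log",
    "processed/train.39.cmvn.ark",
    "processed/train.39.cmvn.scp",
    "processed/train.extract.log",
]


def missing_outputs(data_dir="data"):
    """Return the expected files (relative paths) that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    absent = missing_outputs()
    if absent:
        print("Missing outputs:")
        for rel in absent:
            print("  " + rel)
    else:
        print("All expected outputs are present.")
```

Run it from the repo root. Anything it lists points to the step that needs rerunning: material/ files come from parsing.py, processed/ files from extract_mfcc.sh.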
If you want to do further operations, there's a good repo called kaldi-io-for-python.
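For a quick look at the results without extra dependencies, the .scp index can be read in plain Python. A Kaldi script file normally has one entry per line: an utterance ID, whitespace, then a location such as `path/to/file.ark:offset`. The sketch below assumes that layout. The helper names `parse_scp_line` and `read_scp` are hypothetical, and the utterance ID in the usage note is made up.

```python
def parse_scp_line(line):
    """Split one .scp line into (utterance_id, ark_path, byte_offset).

    byte_offset is None when the entry has no ':<digits>' suffix.
    """
    utt_id, location = line.strip().split(None, 1)
    path, sep, offset = location.rpartition(":")
    if sep and offset.isdigit():
        return utt_id, path, int(offset)
    return utt_id, location, None


def read_scp(scp_path):
    """Map each utterance ID in an .scp file to its (ark_path, byte_offset)."""
    index = {}
    with open(scp_path) as f:
        for line in f:
            if line.strip():  # skip blank lines
                utt_id, path, offset = parse_scp_line(line)
                index[utt_id] = (path, offset)
    return index
```

For example, `parse_scp_line("utt_0001 data/processed/train.39.cmvn.ark:13")` returns `("utt_0001", "data/processed/train.39.cmvn.ark", 13)`. This only reads the index. To load the feature matrices themselves, kaldi-io-for-python is the better tool; as far as I know its README demonstrates `kaldi_io.read_mat_scp` for this, but check its documentation for the exact call.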
Feel free to contact me if there are any problems.
BSD 3-Clause License (2017), Jun-You Liu