punkProse_ASR-demo

This demo contains scripts that punctuate audio recordings using the punkProse library. It is intended for demonstration purposes. It was developed and tested on a Mac, but it should run on any UNIX-based OS.

Installation

Requirements:

Setup:

Install the required Python packages. Install the Montreal Forced Aligner and set the paths to its binaries and models (MFA_ALIGN_BINARY, MFA_LEXICON, MFA_LM) in microphone_recognition.py. Praat must be installed and accessible from the command line as praat.
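Before running the demo, it can help to confirm that these external dependencies are actually reachable. The following is a minimal sketch, not part of the repository: the helper find_missing and the placeholder paths are hypothetical, and only the three variable names and the praat command come from the setup notes above.

```python
import os
import shutil

# Hypothetical placeholder paths; replace them with your local MFA installation.
MFA_ALIGN_BINARY = "/path/to/mfa/align/binary"
MFA_LEXICON = "/path/to/mfa/lexicon"
MFA_LM = "/path/to/mfa/model"


def find_missing(paths, commands):
    """Return a list of human-readable problems; the list is empty if all is found."""
    problems = []
    # Each configured path must exist on disk.
    for name, path in paths.items():
        if not os.path.exists(path):
            problems.append("%s not found at %s" % (name, path))
    # Each command must be resolvable through PATH.
    for command in commands:
        if shutil.which(command) is None:
            problems.append("'%s' is not on PATH" % command)
    return problems


if __name__ == "__main__":
    issues = find_missing(
        {
            "MFA_ALIGN_BINARY": MFA_ALIGN_BINARY,
            "MFA_LEXICON": MFA_LEXICON,
            "MFA_LM": MFA_LM,
        },
        ["praat"],
    )
    for issue in issues:
        print(issue)
    if not issues:
        print("Setup looks complete.")
```

An empty result means every configured path exists and praat resolves on PATH. It does not verify that the aligner or Praat versions are compatible with the demo.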

Speech recognition is done through the Python package speech_recognition. The current setting uses the Google Cloud Speech API as the recognition engine. The credentials need to be stored in a file named credentials.py, under the name GOOGLE_CLOUD_SPEECH_CREDENTIALS.
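As an illustration, credentials.py might look like the sketch below. This README does not state whether the variable should hold the contents of the key file or a path to it. The sketch assumes the former: a string containing the JSON of a Google Cloud service-account key. All field values are placeholders, and the helper credentials_look_valid is hypothetical and not part of the demo.

```python
import json

# Assumption: the variable holds the JSON contents of a service-account key.
# Placeholder values only; paste the contents of your own key file here.
GOOGLE_CLOUD_SPEECH_CREDENTIALS = r"""{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "...",
  "client_email": "..."
}"""


def credentials_look_valid(raw):
    """Return True if raw parses as a JSON object with the basic key fields."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return False
    required = {"type", "project_id", "private_key", "client_email"}
    return isinstance(data, dict) and required.issubset(data)


if __name__ == "__main__":
    print(credentials_look_valid(GOOGLE_CLOUD_SPEECH_CREDENTIALS))
```

The check only confirms that the string is well-formed JSON with the expected fields. It does not contact Google or confirm that the key is authorized for the Speech API.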

Two English punctuation models are currently provided under the directory models: one trained on words only, and another trained on words together with prosodic features (pause and mean f0) and POS features.

Run

To run the demo, type: python listen_and_punctuate.py

You can either record from the microphone or open a pre-recorded audio file. Raw and punctuated proscripts are created under the directory rec.

Visualizing output on Prosograph

To visualize the recordings, install Prosograph and link the directory rec in the file dataconfig_newdata.py under Prosograph. You can switch between the different punctuation outputs using the number keys.

Sample demo setup

[Figure: Demo setup with Prosograph]

Read more

This demo was presented at Interspeech 2018:

@inproceedings{punkProse,
	author = {Alp Oktem and Mireia Farrus and Antonio Bonafonte},
	title = {Visualizing Punctuation Restoration in Speech Transcripts with Prosograph},
	booktitle = {Interspeech 2018},
	year = {2018},
	address = {Hyderabad, India}
}