Speech-to-text aligner

Python module which aligns a known transcript to the output of a speech-to-text processor. This gives each word in the transcript a timestamp and duration.

Installation

sudo python setup.py install

Usage

import sttalign
sttalign.alignJSONText('/path/to/stt-out.json', 'this is the transcript')
sttalign.alignJSON('/path/to/stt-out.json', '/path/to/transcript.json')

stt-out.json:

{
  "words": [
    {"word": "this", "start": 0.02, "end": 0.05},
    {"word": "is", "start": 0.06, "end": 0.10},
    {"word": "the", "start": 0.12, "end": 0.18},
    {"word": "transcript", "start": 0.20, "end": 0.32}
  ]
}

transcript.json:

{ "text": "this is the transcript" }

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py
sttalign.py		sttalign.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Speech-to-text aligner

Installation

Usage

About

Releases

Packages

Languages

License

bbc/stt-align

Folders and files

Latest commit

History

Repository files navigation

Speech-to-text aligner

Installation

Usage

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages