AuraVision (AV) is a real-time speech-to-text demo tool designed to help the deaf and hard-of-hearing community by converting spoken words into text. This demo runs on Windows, Linux, and Mac PCs and supports English, Spanish, and Farsi. The final product will be built with Vosk and Raspberry Pi for hardware deployment.

AuraVision (AV) 👁️‍🗨️

Version 0.2.0

AuraVision (AV) is a real-time speech-to-text demo tool designed to assist the deaf and hard-of-hearing community. It converts spoken words into text and displays them in real time. This demo is intended for use on Windows, Linux, and Mac PCs and supports English, Spanish, and Farsi. The final version will be developed as a hardware product using Vosk and Raspberry Pi.

Features 🌟

  • Real-Time Speech Recognition: Converts spoken words to text instantly. 🎙️
  • Multi-Language Support: English, Spanish, and Farsi. 🌎
  • Text Display Options: Customizable text display, including reversing text for Farsi. 📝
  • Cross-Platform: Runs on Windows, Linux, and Mac PCs. 💻
  • Flask Web Interface: Displays real-time text updates in a web browser with a simple fade-in animation. 🌐

Installation 🛠️

  1. Clone the Repository:

    git clone https://github.com/Rfannn/AuraVision.git
    cd AuraVision
  2. Install Dependencies: Ensure Python 3.x is installed, then install the required packages:

    pip install -r requirements.txt
  3. Download Language Models: Download the Vosk language models for the languages you need from the Vosk Models page and place them in a directory of your choice.

  4. Set Up Paths: Make sure to set the correct paths for the Vosk models in the code:

if lang == 'en':
    model_path = "your-path-to-english-model"
elif lang == 'es':
    model_path = "your-path-to-spanish-model"
elif lang == 'fa':
    model_path = "your-path-to-farsi-model"
  5. Alternative Installation (Windows):

    • Run init.bat to automatically set up the environment and install dependencies.
    • Note: Make sure you have administrative privileges to execute batch files.
  6. Alternative Installation (Linux/Mac):

    • Run init.sh to set up the environment and install dependencies.
    • Remember to give execute permissions to the shell script:
      chmod +x init.sh

Usage 🚀

  1. Run the Flask Server:
python app.py
  2. Run the Speech Recognition Script:
python main.py
  3. Choose Language: Enter the language code when prompted (en for English, es for Spanish, or fa for Farsi).

The script will load the appropriate model and start processing audio.

  4. View Text: Recognized text appears in the console; Farsi text is reversed for display. Alternatively, open your web browser and navigate to http://localhost:5000 to see the real-time text updates with a simple fade-in animation.
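
The console display described under View Text can be sketched roughly as follows. Vosk recognizers return each result as a JSON string with a "text" field (or "partial" for interim results). The helper names extract_text and format_for_console are hypothetical, and plain character reversal for Farsi is an assumption about how this project reverses text, not a description of its actual code.

```python
import json


def extract_text(result_json: str) -> str:
    """Return the recognized text from a Vosk result JSON string.

    Final results carry a "text" field and interim results a "partial"
    field; an empty string is returned when neither is present.
    """
    data = json.loads(result_json)
    return data.get("text") or data.get("partial") or ""


def format_for_console(text: str, lang: str) -> str:
    """Reverse Farsi text for display; leave other languages unchanged.

    Assumption: a plain character reversal, so that right-to-left text
    reads correctly in a console that prints left to right.
    """
    return text[::-1] if lang == "fa" else text


if __name__ == "__main__":
    sample = '{"text": "hello world"}'  # shape of a final Vosk result
    print(format_for_console(extract_text(sample), "en"))
```

Keeping the reversal in one small function means the web view can skip it if the browser already renders right-to-left text correctly.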

Configuration ⚙️

Customize the model paths and other settings in the script as needed. Ensure the paths to language models are correct.
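
One way to keep the model paths in a single place and catch a wrong path early is a lookup table with a check before loading. This is a minimal sketch, not the project's actual code: MODEL_PATHS and resolve_model_path are hypothetical names, and the placeholder paths and language codes are taken from the installation step.

```python
from pathlib import Path
from typing import Mapping, Optional

# Placeholder paths: replace each with the directory of your downloaded model.
MODEL_PATHS = {
    "en": "your-path-to-english-model",
    "es": "your-path-to-spanish-model",
    "fa": "your-path-to-farsi-model",
}


def resolve_model_path(
    lang: str, model_paths: Optional[Mapping[str, str]] = None
) -> Path:
    """Return the model directory for a language code, or raise a clear error."""
    paths = MODEL_PATHS if model_paths is None else model_paths
    if lang not in paths:
        supported = ", ".join(sorted(paths))
        raise ValueError(f"Unsupported language '{lang}'; choose one of: {supported}")
    path = Path(paths[lang])
    if not path.is_dir():
        # Fail before loading so a mistyped path is reported directly.
        raise FileNotFoundError(f"Vosk model directory not found: {path}")
    return path
```

The returned path could then be handed to Vosk, for example as Model(str(resolve_model_path(lang))), assuming the script loads models through vosk.Model.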

Future Development 🌐

This demo is designed for PC platforms (Windows, Linux, Mac) and is a precursor to a hardware product that will use Vosk and Raspberry Pi. Stay tuned for updates on the hardware version of AuraVision! 🛠️

Contributing 🤝

Contributions are welcome! Please fork the repository and submit a pull request for any improvements or fixes. Open an issue on GitHub for questions or feature requests.

License 📜

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements 🙌

  • Vosk for speech recognition.
  • Deep Translator for language translation.
  • Colorama for colored text output.
  • Flask for displaying text output on external devices.

Contact 📬

Feel free to reach out to me via any of the following channels:

Looking forward to connecting with you! 😊👍
