An audio-only Gemini Protocol browser for Linux.
The file gemini.mp3 has a pre-recorded sample of output from gemini://gemini.circumlunar.space/
- Reads text aloud using the espeak text-to-speech tool.
- Plays a unique sound for each type of Gemtext element so that you can follow context changes in the document.
- Reads the alt text of links, but not the URL.
- Reads the alt text of preformatted text, but not the content.
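The reading rules above can be illustrated with a small sketch. This is not the project's actual code: the `toSpoken` function and the `Spoken` type are hypothetical names, and the line prefixes are those of the Gemtext format (`=>` for links, `#` for headings, `* ` for list items, `>` for quotes, and a three-backtick toggle line for preformatted blocks). Each result carries a `kind`, which a reader could use to pick the sound to play, and a `say` string, which is the only part handed to the text-to-speech tool.

```typescript
// Hypothetical sketch of the reading rules, not the project's implementation.
type Kind = "text" | "link" | "heading" | "list" | "quote" | "preformatted";

interface Spoken {
  kind: Kind; // could be used to choose the UI sound for this element
  say: string; // the text that would be read aloud
}

// Built with repeat() so the literal fence never appears inside this example.
const FENCE = "`".repeat(3);

function toSpoken(gemtext: string): Spoken[] {
  const out: Spoken[] = [];
  let inPre = false; // true while between two preformatting toggle lines

  for (const line of gemtext.split(/\r?\n/)) {
    if (line.startsWith(FENCE)) {
      // Text after the opening toggle is the block's alt text.
      if (!inPre) {
        out.push({ kind: "preformatted", say: line.slice(3).trim() });
      }
      inPre = !inPre;
      continue;
    }
    if (inPre) continue; // preformatted content itself is never read

    if (line.startsWith("=>")) {
      // "=> URL optional label": drop the URL, keep the label.
      const [, ...label] = line.slice(2).trim().split(/\s+/);
      out.push({ kind: "link", say: label.join(" ") });
    } else if (line.startsWith("#")) {
      out.push({ kind: "heading", say: line.replace(/^#{1,3}\s*/, "") });
    } else if (line.startsWith("* ")) {
      out.push({ kind: "list", say: line.slice(2) });
    } else if (line.startsWith(">")) {
      out.push({ kind: "quote", say: line.slice(1).trim() });
    } else if (line.trim() !== "") {
      out.push({ kind: "text", say: line });
    }
  }
  return out;
}
```

In this sketch a link without a label yields an empty `say` string, so only its sound would be heard; how the real reader treats that case is not something this example claims to know.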
I have trouble reading large blocks of text. In the past, I've devised numerous fun tricks to convert written works to audio. This is an attempt to create a script for personal use that reads Gemini sites aloud instead of displaying them as text. It might be useful for folks with low vision, no vision, or dyslexia, but I am not sure of this and wrote the code with personal use in mind. Reach out to me if you find this genuinely useful or want to discuss new features.
This project took me one weekend to finish at a slow pace. I have incorporated the following Open Source components:
- Octave UI Sounds - Used for UI sounds, such as the click made by headings, list items, etc.
- Gemtext NPM package - I forked this package because it had extra features that were not needed for my use case.
sudo apt-get install sox espeak
git clone https://github.com/RickCarlino/gemini-protocol-audio-reader.git
cd gemini-protocol-audio-reader
npm install
For example, to visit gemini://gemini.circumlunar.space/:
npx ts-node app.ts gemini://gemini.circumlunar.space/
There are a number of ENV vars available to configure the application:
- PLAY: Command used to play audio files. Default is "play -q".
- TTS: Command used for text to speech. Default is "espeak -s 140 -l60".
- PAUSE: Command run between elements. Default is "sleep 0.5".
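As a minimal sketch of how these variables and their defaults fit together (not necessarily how app.ts reads them), the lookup could be written as below. The `ReaderConfig` interface and the `loadConfig` function are hypothetical names; only the variable names and default values come from the list above. The sketch assumes that an unset or empty variable falls back to its default.

```typescript
// Hypothetical sketch: names other than PLAY, TTS and PAUSE are illustrative.
interface ReaderConfig {
  play: string; // command used to play audio files
  tts: string; // command used for text to speech
  pause: string; // command run between elements
}

// Defaults as documented above.
const DEFAULTS: ReaderConfig = {
  play: "play -q",
  tts: "espeak -s 140 -l60",
  pause: "sleep 0.5",
};

// Takes the environment as a plain object so the function is easy to test.
function loadConfig(env: Record<string, string | undefined>): ReaderConfig {
  return {
    // "||" treats both undefined and "" as "not set" (an assumption).
    play: env["PLAY"] || DEFAULTS.play,
    tts: env["TTS"] || DEFAULTS.tts,
    pause: env["PAUSE"] || DEFAULTS.pause,
  };
}
```

In a Node script this would be called as `loadConfig(process.env)`. From the shell, a variable can be overridden for a single run by prefixing the command, for example `TTS="espeak -s 180" npx ts-node app.ts gemini://gemini.circumlunar.space/` to speak faster.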
- Make this an NPM package so people can use it via npx.
- Ability to export pages as MP3 to read stuff later.
- Interactive mode so that the user can follow links / browse.
- Ability to use Google Cloud TTS instead of espeak.
- Ability to use the library in a non-CLI context.
- Ensure the client can pass a "torture test" https://gemini.circumlunar.space/
- Ability to handle user input (so that people can use search pages and stuff)
- Really wish I could speak links and headings in a different voice than text.
If you find this useful or want to help out, feel free to send me an email or open an issue.